
Actuarial Excellence Through Engineering

ReservePrism

Predictive Modeling for Insurers

Kailan Shang, CFA, FSA, PRM, SCJP

Vice President, Predictive Modeling, Reserve Prism

Managing Director, Swin Solutions

November 2016


Agenda

• What is Predictive Modeling?

• Predictive Models

• Insurance Applications

• Case Studies
    ◦ UBI/Telematics
    ◦ Insurance Product Recommendation
    ◦ Underwriting
    ◦ Crime Classification

• Demo

Predictive Modeling


What is Predictive Modeling?

• The process of using statistical models to predict future trends and behaviors.

• Synonyms: Statistical modeling, regression analysis, data mining, machine learning, data science, etc.

• In the actuarial world, the most widely used predictive model is the Generalized Linear Model (GLM).

• But there is a lot more to explore...


Predictive Modeling Process

1. Data Collection
    • Structured Data (Numerical)
    • Unstructured Data (Text, Voice, Image)

2. Data Processing
    • Feature Extraction
    • Missing Data Treatment
    • Principal Component Analysis
    • Data Normalization
    • Categorical Dummy

3. Model Fitting
    • Classification
    • Regression

4. Model Testing
    • Training Data / Validation Data
    • Precision (Type I) / Recall (Type II)

5. Data Visualization
    • Result Communication
    • Linkage to Decision-Making

Predictive Models

Unsupervised Learning: We do not know Y

• Clustering Analysis: k-means, distribution-based clustering, etc.

• Principal Component Analysis: Fewer variables to explain most of the volatility in the data
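As an illustration of k-means clustering, here is a minimal sketch in plain Python. The data points (age, annual premium in thousands), the function name `kmeans` and its parameters are hypothetical and are not taken from the presentation.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        for j, members in enumerate(clusters):
            if members:
                centers[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centers, clusters

# Hypothetical policyholders: (age, annual premium in thousands).
data = [(25, 1.0), (27, 1.2), (30, 1.1), (55, 4.0), (58, 4.2), (60, 3.9)]
centers, clusters = kmeans(data, k=2)
```

No label Y is supplied anywhere: the groups come only from the distances between records, which is what makes the method unsupervised.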

Predictive Models

Unsupervised Learning: We do not know Y

Association Rules: Item Sets

• Beer + Diaper

• Base Policy + Rider? + Options?
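A rough sketch of how item-set support and rule confidence could be computed for the base policy / rider / option question. The purchase records below are invented for illustration, and `support` and `confidence` are hypothetical helper names.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records: each set lists what one customer holds.
baskets = [
    {"base policy", "rider"},
    {"base policy", "rider", "option"},
    {"base policy"},
    {"base policy", "option"},
    {"base policy", "rider"},
]

def support(itemset, baskets):
    """Share of baskets that contain every item in the item set."""
    return sum(itemset <= b for b in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent) from co-occurrence counts."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

# Count how often each pair of items appears together.
pair_counts = Counter(
    pair for b in baskets for pair in combinations(sorted(b), 2)
)
rule_conf = confidence({"base policy"}, {"rider"}, baskets)  # 3 of 5 baskets -> 0.6
```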

Predictive Models

Supervised Learning: We know Y

Linear Regression: Y = a + bX + ϵ

Generalized Linear Model (GLM): The response may follow a distribution other than the normal.

• Probability distribution from the exponential family (Binomial for logistic regression).

• Linear predictor η = Xβ.

• Link function g such that E(Y) = μ = g⁻¹(η) (Xβ = ln(μ/(1−μ)) for logistic regression).

• Many distributions to choose from: Exponential, Gamma, Inverse Gaussian, Poisson, Multinomial, etc.
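To make the link function concrete, the sketch below evaluates the logistic case: the linear predictor is the logit of the mean, and the inverse link turns it back into a probability. The coefficients and the two covariates (age, smoker flag) are hypothetical.

```python
import math

def logit(mu):
    """Link function g: maps a mean in (0, 1) to the linear predictor."""
    return math.log(mu / (1.0 - mu))

def inverse_logit(eta):
    """Inverse link g^-1: maps the linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def predict_probability(x, beta):
    """E(Y) = g^-1(X beta) for one record; beta[0] is the intercept."""
    eta = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return inverse_logit(eta)

# Hypothetical coefficients: intercept, age effect, smoker effect.
beta = [-3.0, 0.04, 0.9]
p = predict_probability([50, 1], beta)  # eta = -3.0 + 2.0 + 0.9 = -0.1
```

Applying `logit` to `p` recovers the linear predictor, which is the relationship the slide writes as Xβ = ln(μ/(1−μ)).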

Predictive Models

Supervised Learning: We know Y

Decision Tree

[Figure: decision tree. The root splits on Income (<92.5 vs. >=92.5); further splits are on Dwelling Status (No / Yes) and Education (High / Low); the leaves are labeled High Risk or Low Risk.]
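A decision tree of this kind is just nested if/else rules. The sketch below encodes one possible reading of the diagram; the exact branch layout is not recoverable from the slide text, so the nesting (low income leads to High Risk, Dwelling Status = Yes leads to Low Risk, otherwise Education decides) is an assumption, as are the function and argument names.

```python
def classify_risk(income, owns_dwelling, education):
    """Walk the tree top-down; the branch layout is an assumed reading of the slide."""
    if income < 92.5:
        return "High Risk"
    if owns_dwelling:  # Dwelling Status = Yes
        return "Low Risk"
    # Dwelling Status = No: the Education split decides.
    return "Low Risk" if education == "High" else "High Risk"

example = classify_risk(income=120.0, owns_dwelling=False, education="High")
```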

Predictive Models

Supervised Learning: We know Y

Classification and Regression Tree (CART)

[Figure: regression tree with splits X3 < 10, X7 < 36.7, X5 < 2.67, X9 < 10.75 and X12 < 599.4, and leaves (Y = 4.2; N = 56), (Y = 3.7; N = 336), (Y = 0; N = 235), (Y = 21; N = 15), (Y = 1.6; N = 126), (Y = 0; N = 20).]

Predictive Models

Supervised Learning: We know Y

Random Forest: random version of CART

[Figure: random forest workflow. The training data (Y ~ X) is sampled into subsets Y1 ~ X1, Y2 ~ X2, Y3 ~ X3, ..., Yn ~ Xn. One CART is fitted per subset (example splits: X3 < 10, X4 < 20, X5 < 2.67, X6 < 6.7, X7 < 36.7, X9 < 10.75, X12 < 599.4; example leaves: Y = 1 (5/67), Y = 0 (15/1)). Each tree predicts a class (Y = 0, Y = 1, Y = 1, ..., Y = 1) and the votes are tallied: Y = 0 (23), Y = 1 (177). Predicted result: Y = 1.]
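The sample / fit / vote loop can be sketched in a few lines. This is a simplified illustration rather than a full random forest: each "tree" is a single-split stump, and the per-split random feature selection used by real random forests is omitted. The training rows and all function names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a one-split tree: pick the (feature, threshold) with the fewest errors."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]
            left = [y for xi, y in rows if xi[f] < t]
            right = [y for xi, y in rows if xi[f] >= t]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:  # no usable split: always predict the majority class
        label = Counter(y for _, y in rows).most_common(1)[0][0]
        return lambda x: label
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] < t else right_label

def fit_forest(rows, n_trees=200, seed=0):
    """Fit each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]

def predict(forest, x):
    """Majority vote across trees; also return the vote tally."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0], votes

# Hypothetical training rows: ((x1, x2), y).
rows = [((5, 1.0), 0), ((7, 2.0), 0), ((8, 1.5), 0),
        ((12, 3.0), 1), ((15, 2.5), 1), ((20, 3.5), 1)]
forest = fit_forest(rows)
label, votes = predict(forest, (25, 4.0))
```

The tally in `votes` plays the same role as the 23 versus 177 vote count in the diagram.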

Predictive Models

Supervised Learning: We know Y

Artificial Neural Network (Deep Learning)

[Figure: feed-forward network. Input layer X1, X2, X3, ...; hidden layers a1, ..., a4 and b1, b2, ..., bn-1, bn; output layer Y1, Y2.]

ai = f(x1, x2, x3); bi = g(a1, ..., a4); Yi = h(b1, ..., bn)
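The layer-by-layer composition a = f(x), b = g(a), Y = h(b) can be sketched as a forward pass. The weights here are random and untrained, the sigmoid activation is an assumed choice, and n = 5 is picked arbitrarily; none of these come from the slides.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each unit."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def random_layer(n_in, n_out, rng):
    """Hypothetical untrained weights, only to make the sketch runnable."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
layer_a = random_layer(3, 4, rng)  # a_i = f(x1, x2, x3)
layer_b = random_layer(4, 5, rng)  # b_i = g(a1, ..., a4), with n = 5 here
layer_y = random_layer(5, 2, rng)  # Y_i = h(b1, ..., bn)

def forward(x):
    a = dense(x, layer_a[0], layer_a[1], sigmoid)
    b = dense(a, layer_b[0], layer_b[1], sigmoid)
    return dense(b, layer_y[0], layer_y[1], sigmoid)

y = forward([0.2, -1.0, 0.5])
```

Training would adjust the weights so that the outputs match the known Y; only the prediction step is shown here.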

Predictive Models

Supervised Learning: We know Y

Bayesian Network

[Figure: directed graph with nodes Product Complexity, Misleading Advertisement, Misunderstanding, Compensation Level and Penalty Cost (node labels A, B, D, E; edges numbered 1 to 5), annotated with probability tables.]

Probabilities shown in the figure:

• P(Complex) = 0.3
• P(High|Complex) = 0.8, P(Low|Complex) = 0.2
• P(High) = 0.5
• P(High|Complex) = 0.65, P(Low|Complex) = 0.35
• P(High|High B & High C & High D) = 0.95, P(Low|High B & High C & High D) = 0.05
• ...


Hidden Markov Model

Supervised Learning: We know Y

Predictive Models

Insurance Application

• Pricing
    ◦ More pricing factors

• Reserving
    ◦ Claim classification
    ◦ Case reserve adequacy assessment
    ◦ False claim identification
    ◦ Claim closure/reopening
    ◦ Claim size prediction

• Underwriting
    ◦ High risk case identification
    ◦ Automatic underwriting

• Marketing
    ◦ Customer retention, renewal, or resale
    ◦ Personalized product recommendation
    ◦ Customer categorization

• Risk Analysis
    ◦ Fraud detection
    ◦ Credit score

• Business Disruption
    ◦ UBI
    ◦ Health/Fitness discount

Case Study: UBI/Telematics

Raw Geolocation Data

    X        y
    0        0
    -7.4     -7.5
    -15      -14.8
    -22.6    -22.1
    -29.9    -28.9
    -37.7    -35.3
    -44.7    -42.3
    -51.5    -49.6
    …        …

One second later, the vehicle moved to (-7.4, -7.5), which is 7.4 m south and 7.5 m west of the starting point.

Each driver has 200 driving paths.

The paper can be downloaded at http://reserveprism.com/docs/PredictiveModelingUsingTelematics.pdf

Its Chinese version has been published by the CAA.

Case Study: UBI/Telematics

Feature Extraction (Pricing / Risk Assessment)

    Feature            Explanation
    Time               The time of a trip, obtained by counting the number of rows in a trip file
    Speedavg           Average speed
    Speedvol           Standard deviation of the speed
    Speedmin           Minimum speed
    Speedmax           Maximum speed
    Speed10            10th percentile of speed
    Speed30            30th percentile of speed
    Speed70            70th percentile of speed
    Speed90            90th percentile of speed
    accelerationavg    Average acceleration
    accelerationvol    Standard deviation of the acceleration
    accelerationmin    Minimum acceleration
    accelerationmax    Maximum acceleration
    Acceleration10     10th percentile of acceleration
    Acceleration30     30th percentile of acceleration
    Acceleration70     70th percentile of acceleration
    Acceleration90     90th percentile of acceleration
    Nofast             Number of accelerations greater than 2
    Noslow             Number of decelerations below -2
    segdistanceavg     Average distance per second
    segdistancevol     Standard deviation of the distance
    Length             Length of the trip (average speed * time)
    Indicator (Y)      Whether the trip belongs to the driver. Initially, it is assumed that the 200 trips in a driver's folder all belong to the driver.
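A sketch of how several of these features might be derived from one trip's coordinate rows, assuming positions are in metres and recorded once per second (so the distance covered in a step equals the speed in that second). The nearest-rank percentile convention and the function names are assumptions; only a subset of the listed features is computed.

```python
import math
import statistics

def percentile(values, q):
    """Nearest-rank percentile (q in 0..100); one of several conventions."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def trip_features(path):
    """path: list of (x, y) positions, one row per second."""
    # Distance covered in each one-second step, i.e. speed per second.
    speed = [math.hypot(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(path, path[1:])]
    # Change in speed from one second to the next.
    accel = [v1 - v0 for v0, v1 in zip(speed, speed[1:])]
    return {
        "Time": len(path),
        "Speedavg": statistics.mean(speed),
        "Speedvol": statistics.pstdev(speed),
        "Speedmin": min(speed),
        "Speedmax": max(speed),
        "Speed90": percentile(speed, 90),
        "accelerationavg": statistics.mean(accel),
        "Nofast": sum(a > 2 for a in accel),
        "Noslow": sum(a < -2 for a in accel),
        # Slide's definition: average speed * time.
        "Length": statistics.mean(speed) * len(path),
    }

# First rows of the raw geolocation data from the case study.
path = [(0, 0), (-7.4, -7.5), (-15, -14.8), (-22.6, -22.1),
        (-29.9, -28.9), (-37.7, -35.3), (-44.7, -42.3), (-51.5, -49.6)]
features = trip_features(path)
```

Each trip file collapses to one row of such features, and that row is what the classifier sees.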

Case Study: UBI/Telematics

We want to predict whether a given trip was driven by the policyholder.

Unsupervised → Supervised

• Raw data: Driver 1, Driver 2, ..., Driver N each have 200 driving trips.

• For each driver, a training data set of 700 trips is built: the driver's own 200 trips plus 500 random trips.
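One way the 700-trip training set could be assembled is sketched below. It assumes the 500 random trips are drawn from other drivers and labeled 0, while the driver's own 200 trips are labeled 1; the slide does not spell this out, and the data structure and function name are hypothetical.

```python
import random

def build_training_set(trips_by_driver, driver_id, n_other=500, seed=0):
    """Return (trip, label) rows: 1 for the driver's own trips, 0 for the rest."""
    rng = random.Random(seed)
    own = [(trip, 1) for trip in trips_by_driver[driver_id]]
    # Assumption: the random trips come from all other drivers' folders.
    pool = [trip
            for d, trips in trips_by_driver.items() if d != driver_id
            for trip in trips]
    others = [(trip, 0) for trip in rng.sample(pool, n_other)]
    rows = own + others
    rng.shuffle(rows)
    return rows

# Hypothetical data: 4 drivers with 200 trips each; a trip is a feature list.
trips_by_driver = {d: [[d, i] for i in range(200)] for d in range(4)}
rows = build_training_set(trips_by_driver, driver_id=0)
```

Adding labeled negatives is what turns the originally unlabeled problem into a supervised one.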

Case Study: UBI/Telematics

Random Forest

[Figure: random forest workflow. The training data (Y ~ X) is sampled into subsets Y1 ~ X1, ..., Yn ~ Xn; one CART is fitted per subset; the votes Y = 0 (23) and Y = 1 (177) give the predicted result Y = 1.]


Case Study: UBI/Telematics

Validation

Confusion Matrix

    Total Trips           Predicted Right Trips    Predicted Wrong Trips
    Actual Right Trips    True Right Trips         False Wrong Trips
    Actual Wrong Trips    False Right Trips        True Wrong Trips

    Total Trips           Predicted Right Trips    Predicted Wrong Trips
    Actual Right Trips    43                       7
    Actual Wrong Trips    15                       185
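The standard summary metrics follow directly from these four counts. The sketch treats "right trip" as the positive class; the variable names are illustrative.

```python
tp, fn = 43, 7     # actual right trips: predicted right / predicted wrong
fp, tn = 15, 185   # actual wrong trips: predicted right / predicted wrong

accuracy = (tp + tn) / (tp + fn + fp + tn)   # 228 / 250 = 0.912
precision = tp / (tp + fp)                   # 43 / 58, about 0.741
recall = tp / (tp + fn)                      # 43 / 50 = 0.86
```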

A more mature business model with time and actual coordinates.

Case Study: Insurance Product Recommendation

Data

• Demographic Information: age, gender, address, zip code, smoker/nonsmoker, health status, occupation, marital status

• Financial Information: assets, real estate, income, loans and spending

• Purchase History: product type, product name, issue age, face amount, premium rate, face amount change, partial withdrawal, policy loan and product conversion

• Claim History: time, amount, payment

• Communication History: last contact time, reason, outcome, complaints

Case Study: Insurance Product Recommendation

Data Processing

1. Non-numerical data: Grouping and converting to dummy variables

2. Missing data: Use the value of the similar records.

• Similarity is measured by the Euclidean distance

$$ d(X, Y) = \sqrt{\sum_{i=1,\; i \neq l}^{n} (Y_i - X_i)^2} $$

where

X is the data record with a missing value for variable l,

Y is a complete data record in the dataset,

n is the number of variables in the dataset.
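The missing-data rule can be sketched as a nearest-neighbour lookup: compute the Euclidean distance over all variables except the missing one, then copy the value from the closest complete record. The records, the use of `None` to mark the missing value, and the function names are assumptions for illustration.

```python
import math

def distance(x, y, skip):
    """Euclidean distance over every variable except index `skip`."""
    return math.sqrt(sum((yi - xi) ** 2
                         for i, (xi, yi) in enumerate(zip(x, y))
                         if i != skip))

def impute(record, complete_records):
    """Fill the single missing value (None) from the most similar complete record."""
    l = record.index(None)
    nearest = min(complete_records, key=lambda y: distance(record, y, l))
    filled = list(record)
    filled[l] = nearest[l]
    return filled

# Hypothetical records: (age, income in thousands, smoker flag).
complete = [[30, 50.0, 1.0], [60, 20.0, 0.0]]
filled = impute([32, None, 1.0], complete)
```

In practice the variables would usually be normalized first, so that one variable on a large scale does not dominate the distance.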

Case Study: Insurance Product Recommendation

Model: Artificial Neural Network

Input Data → g(1) → Hidden Layer → g(2) → Hidden Layer → g(3) → Output

The essay is published by the SOA:

https://www.soa.org/Files/Research/research-2016-predictive-analytics-call-essays.pdf

Case Study: Insurance Product Recommendation

Heuristic Training

[Figure: network diagram. Inputs: Demographic Info, Financial Info, Purchase History, Claim History, Communication History. Other nodes shown: Affordability, Satisfaction, Risk Appetite, New Insurance Needs.]

Case Study: Cancer Patient Mortality Prediction

The Surveillance, Epidemiology, and End Results (SEER) research data (1972 – 2012)

• Demographic (age, gender, race)

• Diagnostic (time, tumor size, type, location)

• Histological

• Medical Treatment

• Survivorship

Models: Linear, Logistic, KNN, CART, Random Forest, ANN


Case Study: Cancer Patient Mortality Prediction

[Figure: six ROC curve panels, one per model: Linear Regression, Logistic Regression, CART, Random Forest, KNN(5), ANN(10,5). Axes: True Positive Rate and True Negative Rate.]

Receiver operating characteristic curve: 0.5 is used as the threshold for survival and death.
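An ROC curve is traced by sweeping the classification threshold and recording the true positive and true negative rates at each value; the 0.5 threshold is one point on that curve. The scores and outcomes below are invented, and `roc_points` is a hypothetical helper.

```python
def roc_points(scores, labels, thresholds):
    """Return (threshold, true positive rate, true negative rate) triples."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        predicted = [s >= t for s in scores]
        tp = sum(p and y == 1 for p, y in zip(predicted, labels))
        tn = sum((not p) and y == 0 for p, y in zip(predicted, labels))
        points.append((t, tp / positives, tn / negatives))
    return points

# Hypothetical predicted death probabilities and outcomes (1 = death).
scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]
curve = roc_points(scores, labels, [0.0, 0.25, 0.5, 0.75, 1.01])
```

Lowering the threshold raises the true positive rate at the cost of the true negative rate, which is the trade-off each panel displays.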

Case Study: Cancer Patient Mortality Prediction

Breast Cancer

Linear

    Variable                                                                Weight
    Number of regional lymph nodes                                          32
    Age at Diagnosis                                                        8
    Number of regional lymph nodes removed or examined                      7
    Surgery procedure of primary site                                       7
    Surgery type (site)                                                     6
    Race                                                                    5
    Age at 2012                                                             4
    Number of regional lymph nodes that were found to contain metastases    4
    Primary site and histology for children                                 4
    Stage information                                                       3

CART

    Variable                                                                Weight
    Age at 2012                                                             42
    Age at Diagnosis                                                        16
    Stage information                                                       16
    Tumor type (positive/negative)                                          9
    Insurance Status                                                        4
    Primary site                                                            2
    Reason for no surgery                                                   2
    Tumor extension                                                         1
    Involvement of lymph nodes                                              1
    Number of regional lymph nodes that were found to contain metastases    1


Case Study: Crime Classification

Data: San Francisco Police Department crime incident data

Goal: Given the time, address and geolocation of the reported crime incident, predict the crime type to allocate appropriate resources.

    Dates             Category        DayOfWeek  PdDistrict  Address                    X         Y
    2015-05-13 23:53  WARRANTS        Wednesday  NORTHERN    OAK ST / LAGUNA ST         -122.426  37.7746
    2015-05-13 23:53  OTHER OFFENSES  Wednesday  NORTHERN    OAK ST / LAGUNA ST         -122.426  37.7746
    2015-05-13 23:33  OTHER OFFENSES  Wednesday  NORTHERN    VANNESS AV / GREENWICH ST  -122.424  37.80041
    2015-05-13 23:30  LARCENY/THEFT   Wednesday  NORTHERN    1500 Block of LOMBARD ST   -122.427  37.80087
    2015-05-13 23:30  LARCENY/THEFT   Wednesday  PARK        100 Block of BRODERICK ST  -122.439  37.77154
    2015-05-13 23:30  LARCENY/THEFT   Wednesday  INGLESIDE   0 Block of TEDDY AV        -122.403  37.71343
    2015-05-13 23:30  VEHICLE THEFT   Wednesday  INGLESIDE   AVALON AV / PERU AV        -122.423  37.72514
    2015-05-13 23:30  VEHICLE THEFT   Wednesday  BAYVIEW     KIRKWOOD AV / DONAHUE ST   -122.371  37.72756
    2015-05-13 23:00  LARCENY/THEFT   Wednesday  RICHMOND    600 Block of 47TH AV       -122.508  37.7766

Case Study: Crime Classification

Crime on the Map

Case Study: Crime Classification

Feature Extraction

To use address in the prediction, words were counted and the frequent ones were used in the classification.
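A sketch of this word-count approach, using the addresses from the sample incident rows. The tokenization rule, the minimum-count threshold of 3 and the binary indicator encoding are assumptions; the presentation does not state which were used.

```python
from collections import Counter

addresses = [
    "OAK ST / LAGUNA ST",
    "OAK ST / LAGUNA ST",
    "VANNESS AV / GREENWICH ST",
    "1500 Block of LOMBARD ST",
    "100 Block of BRODERICK ST",
    "0 Block of TEDDY AV",
    "AVALON AV / PERU AV",
    "KIRKWOOD AV / DONAHUE ST",
    "600 Block of 47TH AV",
]

def tokenize(address):
    """Split on whitespace, upper-case, and drop the intersection separator."""
    return [w for w in address.upper().split() if w != "/"]

word_counts = Counter(w for a in addresses for w in tokenize(a))

# Keep words seen at least `min_count` times (the threshold is an assumption).
min_count = 3
vocabulary = sorted(w for w, c in word_counts.items() if c >= min_count)

def address_features(address):
    """One binary indicator per frequent word."""
    words = set(tokenize(address))
    return [int(w in words) for w in vocabulary]

example = address_features("200 Block of MAIN ST")
```

Words such as "BLOCK" or "/"-style intersections can carry signal because they separate mid-block incidents from street-corner ones.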

Case Study: Crime Classification

Prediction

Models used: Linear Discriminant Analysis, Logistic Model, K-nearest neighbor, artificial neural network


Demo


Q&A

About Reserve Prism

ReservePrism is an advanced enterprise actuarial loss reserving, pricing and predictive modeling platform.

The opinions expressed and conclusions reached by the presenter are his own and do not represent any official position or opinion of ReservePrism. ReservePrism disclaims responsibility for any private publication or statement by any of its employees.

Visit us @ http://www.reserveprism.com.