Society of Insurance Research, 3rd Party Data
DESCRIPTION
October 13th 2012 presentation by Kevin McCarthy at the 42nd annual Society of Insurance Research conference in Pittsburgh, PA
TRANSCRIPT
Not for Public Use
3rd Party Data
Who has it, how to get it and use it
Pittsburgh, PA
October 10th, 2012
2
Hi, I’m Kevin
compdata
I live here
I like this
I work on this
3
Focus @ Risk Metrics
Data Cloud
Mobile
4
Data
¨ Average 1 Terabyte net new monthly
¨ 209 million records processed YTD 2012
¨ Manage data on 4+ million businesses
¨ 16 people
5
“Cloud”
0.0064%
6
Mobile
7
Second Day Presenter
8
Dealing with Big Data through Analogies
9
Who do we have in today?
Show of hands…
¨ Marketing
¨ Distribution
¨ Business Intelligence
¨ Underwriting
¨ Claims
¨ Data Scientists?
10
Pioneers
11
3rd Party Data
Who has it, how to get it and how to use it.
¨ You have internal data
¨ What else is out there?
¨ How do you get access to it?
¨ Acquire, integrate and leverage?
12
3rd Party Data
Why?
¨ Models are made up of variables likely to influence future
behavior or results.
¨ The more relevant data you collect of the right type and at
the right level, the more accurate the prediction.
¨ Accessible
¨ Structured
¨ Scalable
“We don’t win because we have better algorithms, we win because we have better data”
- Larry Page, Google
13
Application Potential
14
Who is 3rd Party
15
Data Markets & Data as a Service (DaaS)
¨ SaaS + PaaS + 3rd Party = DaaS
¨ Recent emergence
¨ Growing trend
¨ Not yet “one stop shop”
Recent emergence of data marketplaces.
16
Access and Delivery Methodologies
¨ Download
¨ Views
¨ Batch
¨ API
¨ Scrape
17
Normalization
Preparation for integration
Address Triangulation
CASS, NCOA and Lat/long
Initial Data Set
Name Standardization
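As a rough illustration of the name standardization step, here is a minimal Python sketch. The abbreviation map and the function name `standardize_name` are hypothetical and chosen only for this example; they are not the presenter's actual rules, and a production pipeline would likely rely on a much larger curated dictionary.

```python
import re

# Hypothetical abbreviation map, for illustration only.
ABBREVIATIONS = {
    "rstrnt": "restaurant",
    "hmbrgrs": "hamburgers",
    "co": "company",
    "inc": "incorporated",
}


def standardize_name(raw: str) -> str:
    """Lowercase, drop punctuation, and expand known abbreviations."""
    # Remove apostrophes outright so "Wendy's" becomes "wendys".
    text = raw.lower().replace("’", "").replace("'", "")
    # Any other non-alphanumeric character becomes a space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


# Two spellings of the same business from different sources
# end up with the same standardized form.
print(standardize_name("L. L. Vann Electric, Inc."))
print(standardize_name("L L VANN ELECTRIC INC"))
```

Standardizing first means later matching steps compare like with like instead of tripping over punctuation and casing.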
18
“Fuzzy Logic”
Mathematical processes that determine the similarities between data sets
19
“Fuzzy Logic”
Same, even though they look different:
¨ “McDonald’s Restaurant” and “Mc Donalds Family Rstrnt”
¨ “Starbucks Coffee Co” and “Starbucks”
¨ “Wendys Old Fashion Hmbrgrs” and “Wendy’s Restaurant”
Science + Art
Different, even though they look similar:
¨ “Jim S. Starbuck” (CPA) and “Jim St. Starbuck” (Starbucks on
Jim St.)
¨ “Wendy’s” vs. “Wendy’s Donuts” vs. “Nails by Wendy”
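One common way to score how alike two names are is character n-gram overlap. The sketch below uses trigrams with a Jaccard ratio; it is a generic illustration, not the matching algorithm used by the presenter's team, and the function names are made up for this example.

```python
def char_ngrams(text: str, n: int = 3) -> set:
    """Return the set of character n-grams of the alphanumeric, lowercased text."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum())
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of character n-grams: 0.0 (nothing shared) to 1.0 (identical sets)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Related names share trigrams; unrelated names share none.
print(ngram_similarity("Starbucks Coffee Co", "Starbucks"))
print(ngram_similarity("Starbucks Coffee Co", "Nails by Wendy"))

# The score alone cannot separate look-alikes that are different entities,
# which is where the "art" (business rules, address checks) comes in.
print(ngram_similarity("Jim S. Starbuck", "Jim St. Starbuck"))
```

A score like this is only one signal; any cutoff would need to be tuned against reviewed matches.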
20
API
So what's an API anyway?
¨ Application Programming Interface.
¨ Commands and formats for standardized program communication
Providers
Analogies
21
APIs and Data-as-a-Service (DaaS)
Scalable access to data
URL API Query
sTe52D4spwCvnAX47RpBHhz608i
XML or JSON Response
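The query-and-response pattern can be sketched in Python without calling a real service. Everything specific here is an assumption for illustration: the endpoint `api.example.com`, the parameter names, the `YOUR_API_KEY` placeholder, and the JSON field names. The sample response is a hard-coded string, not output from any actual provider.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the provider's documented URL.
BASE_URL = "https://api.example.com/v1/business"


def build_query_url(params: dict, api_key: str) -> str:
    """Encode the search parameters and API key into a URL query string."""
    query = urlencode({**params, "key": api_key})
    return f"{BASE_URL}?{query}"


def parse_response(body: str) -> dict:
    """Turn a JSON response body into a Python dictionary."""
    return json.loads(body)


url = build_query_url(
    {"name": "L. L. Vann Electric, Inc.", "zip": "27603"},
    api_key="YOUR_API_KEY",
)
print(url)

# Stand-in for what a provider might send back (field names are invented).
sample_body = '{"unique_id": "NC428", "city": "Raleigh", "state": "NC"}'
record = parse_response(sample_body)
print(record["unique_id"])
```

In practice the URL would be fetched over HTTPS and the body parsed the same way; an XML response would need an XML parser instead of `json`.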
22
Property Example
Data and Match Overview
¨ 161,560 records in from client
¨ 9,719 unable to be standardized
¨ 2,052 not in coverage
¨ 94,181 address match (62.62%)
¨ 5,074 mailing address match (3.37%)
23
Workers Compensation Example
Data and Match Overview
¨ 50,691 policy matches (31.37%)
¨ 41,014 with incumbent carrier of record
¨ 38,513 with effective date
¨ High effective date correlation to Hanover effective date
¨ 6,982 picked up by RM n-gram match algorithm*
¨ 43,000 matches available via API
Correlation to other line X-Date
*For Discussion: RM File, Optimizer Mismatch…
24
Auto Example
Data and Match Overview
¨ 31,560 current customers with 1 or more vehicles
¨ 293,100 Total Insurable Vehicles
Auto Outliers (Examples)
¨ Lease Plan USA, Alpharetta, GA (34,218)
¨ New Jersey Transit Corp, Newark, NJ (4,050)
¨ GSP Transportation, Greer, SC (1,429)
¨ Frac Tech Services, Cisco, TX (1,162)
¨ City of Houston, Houston, TX (1,125)
84%
16%
25
Example Record Set
Type: Base Record Information
Field                         Data
Unique ID                     NC428
Description                   L. L. Vann Electric, Inc.
Address                       833 Purser Dr
City                          Raleigh
State                         NC
ZIP                           27603
Telephone                     9197722567
County                        WAKE
Emp Total                     100-499
Year Started                  1987
SIC                           1731 - Electrical Contractor
Type: Work Comp
Field                         Data
Effective Date                12/31
Effective Month               12
NAIC Carrier Number           31325
NAIC Carrier Name             ACADIA INSURANCE COMPANY
NAIC Group Number             98
NAIC Group Name               WR BERKLEY CORPORATION GROUP
WC Class Code                 3179 - Electrician
Type: Property Ownership Information
Field                         Data
Current Owner Name            L L VANN ELECTRIC INC
Street Address                833 PURSER DR
Mailing City                  RALEIGH
Mailing State                 NC
Mailing Zip                   27603
Total Assessed Value          $715,482
Assessed Improvement Value    $539,064
Assessed Land Value           $176,418
Assessment Year               2010
Total Market Value            $715,482
Market Value: Improvement     $539,064
Market Value: Land            $176,418
Market Value Year             2010
Original Date of Contract     12.02.1997
Sales Price                   $390,000
Year Built                    1987
Zoning                        SB
Original Lot Size or Area     1.62 AC
Building Area                 4,800
No of Buildings               2
No of Stories                 1
Type: Commercial Auto Data
Field                         Data
Car Fleet                     2
Truck Fleet                   54
Total Fleet                   56
Lower Middle (Car)            1
Upper Middle (Car)            1
Heavy Duty Station Wagon      2
Window Van (Passenger)        2
Mini Sport Utility            2
Midsize Pickup                45
Bus                           3
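Once records like this one are integrated, simple consistency checks help catch merge errors. The Python sketch below holds the commercial auto portion of the example record in a dictionary and checks that the fleet counts add up. The dictionary layout and key names are assumptions made for this illustration; the values are the ones shown in the record.

```python
# Commercial auto portion of the example record (layout is hypothetical).
commercial_auto = {
    "car_fleet": 2,
    "truck_fleet": 54,
    "total_fleet": 56,
    "vehicle_types": {
        "Lower Middle (Car)": 1,
        "Upper Middle (Car)": 1,
        "Heavy Duty Station Wagon": 2,
        "Window Van (Passenger)": 2,
        "Mini Sport Utility": 2,
        "Midsize Pickup": 45,
        "Bus": 3,
    },
}


def fleet_is_consistent(auto: dict) -> bool:
    """Check that fleet subtotals and per-type counts agree with the total."""
    subtotal_ok = auto["car_fleet"] + auto["truck_fleet"] == auto["total_fleet"]
    type_total_ok = sum(auto["vehicle_types"].values()) == auto["total_fleet"]
    return subtotal_ok and type_total_ok


print(fleet_is_consistent(commercial_auto))
```

Checks of this kind are cheap to run on every load and flag records where a source update changed one field but not the others.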
26
Sourcing
Continually gather insight
PROCESS
¨ Identification
¨ Evaluation
¨ Implementation
¨ Management
¨ Post Implementation
IMPORTANT QUESTIONS
¨ Source of data
¨ Supply chain
¨ Update frequency
¨ Distribution timeframe
¨ Test support
¨ Rent vs. Acquire
27
Closing
More analogies