TRANSCRIPT
Graph Search for Healthcare: How algorithms socially disrupt the bad guys while helping socially change health outcomes
Jo Prichard (@joprichard)
Data Scientist | LexisNexis Risk Solutions
September 2013
See Through Patterns, Hidden Relationships and Networks to Find Opportunities in Big Data.
WHT/082311
http://hpccsystems.com
Graph Search vs. Page Rank
• Real-time search vs. pre-calculated vertex variables.
• Ideal is a combination of both.
• Measure the whole graph (Page Rank style) AND search the whole graph (Graph Search style).
HPCC Systems & LexisNexis social graph
• Enterprise-ready, open-source, high-performance distributed big data processing platform.
• +- 270 million Active Identities, 4 billion people relationships
• 24 billion rows in a distributed partitioned graph.
Simple example of a graph calculation
• Partition a graph.
• JOIN is your friend (when it is distributed and not on an RDBMS!)
• LESS CODE, MORE POWER, MORE VALUE!
Case Study Example: Applying graph analysis to measure socialized prescriptions.
• Social Graph prescription stats to measure social density.
• Case study results.
• Transform insights to actionable data.
Graph Search for Healthcare
Graph Search vs. Page Rank for Healthcare
Real-time search vs. pre-calculated vertex variables.
Trusted Relationships
Graph Search Style for a single patient
• How many of my associates are smokers?
• Do I have a licensed medical professional in my social network?
• Are most of my associates and their associates getting the flu shot this year?
• How many of my associates live near where I live?
• I have a prescription for Vicodin; how many of my associates and their associates also have prescriptions for Vicodin?
• How far do my associates travel geographically to fill scheduled drug prescriptions relative to their other prescriptions?
Page Rank Style for all patients
• Calculate the answer for every vertex!!
Best of both styles.
• Are health outcomes negatively affected if your associates smoke?
• Do personal associations with a licensed medical professional impact hospital readmittance rates?
• Which elderly or disabled patients are more at risk because they do not live near their support system?
• Are there dense social clusters with risk factors for obesity?
• How normal is it for you and 15 of your close friends to all be receiving Vicodin prescriptions at the same time, and are you all catching a plane from Alabama to Tampa to fill them?
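The difference between the two styles can be illustrated with a small Python sketch. It is not code from the talk: the people, the edge list, and the `SMOKERS` set are hypothetical toy data, and the function names are made up for illustration. The same question ("how many of my associates are smokers?") is answered once for a single patient and once for every vertex.

```python
from collections import defaultdict

# Hypothetical toy data: undirected associations and one per-person attribute.
EDGES = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("cat", "dan"), ("dan", "eve")]
SMOKERS = {"bob", "cat", "dan"}


def build_adjacency(edges):
    """Turn an edge list into a person -> set-of-associates mapping."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def smoker_associates(adj, person):
    """Graph Search style: answer the question for one person on demand."""
    return sum(1 for other in adj.get(person, ()) if other in SMOKERS)


def smoker_associates_all(adj):
    """Page Rank style: pre-calculate the same variable for every vertex."""
    return {person: smoker_associates(adj, person) for person in adj}


adj = build_adjacency(EDGES)
single = smoker_associates(adj, "ann")   # one patient, computed on request
everyone = smoker_associates_all(adj)    # every vertex, computed up front
```

Once `everyone` exists, the "best of both" questions become ordinary analytics over a per-vertex variable, for example correlating it with health outcomes.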
LexisNexis Risk Solutions
• A division of Reed Elsevier.
• 2012 LexisNexis Risk Solutions Revenue = $1.5 billion
• Expanding Healthcare vertical with recent acquisitions in the Healthcare space.
HPCC Systems
• High Performance Distributed Processing Platform
• Open Source, in Production for more than a decade
• Utilizes Commodity Hardware
LexisNexis public data social graph
• Relationships inferred from 50TB of Public Records Data.
• People connected to people, assets, businesses and more.
• +- 270 million Active Identities, 4 billion people relationships
• High Value relationships for Mapping trusted networks.
Examples leveraging LexisNexis social graph.
• Healthcare
  • Medicaid/Medicare Fraud.
  • Drug Seeking Behavior.
  • Disease Management and Wellness Programs.
• Financial Services.
  • Mortgage Fraud.
  • “Bust out” Fraud.
• Insurance
  • Staged Accident Fraud.
About LexisNexis Risk Solutions
Relationships in a nutshell.
Shared Historical Addresses
Shared Business Ownership
Shared Assets (Property, Vehicles, etc.)
Trusted Relationships
No Social Media Data!
Simple graph
Edges (Links)
From        To          degree
1117906843  1117906843  0.00
1117906843  1166180939  1.00
1117906843  71384691    1.00
1117906843  1572188131  1.00
1117906843  2182832221  1.30
1117906843  2280607022  1.25
1117906843  1773055127  1.20
1117906843  1541607980  1.80
1117906843  1531070616  2.00
240422663   240422663   0.00
240422663   1166180939  1.00
240422663   71384691    1.00
240422663   1572188131  1.00
240422663   2182832221  1.40
240422663   2280607022  1.60
240422663   1773055127  1.75
240422663   1541607980  1.80
240422663   1531070616  2.00
Edge attributes & variables: Degree, Type of Relationship, Date Range…

Vertexes (Nodes)
VertexID    f_name   l_name            age  address
1117906843  JAMES    ANDERSON          34   P.O. BOX 555 MAIN STREET, BROOKLYN, NY
1166180939  JANET    JACKSON-ANDERSON  36   P.O. BOX 555 MAIN STREET, BROOKLYN, NY
71384691    JAN      HUNT              39   SUITE 202, MAIN STREET, BROOKLYN, NY
1572188131  GARY     JACKSON           45   SUITE 204, MAIN STREET, BROOKLYN, NY
2182832221  KEVIN    PIETERSON         34   SUITE 143, MAIN STREET, BROOKLYN, NY
2280607022  KENNY    JACKSON           43   SUITE 322, MAIN STREET, BROOKLYN, NY
1773055127  HARRY    JAMESON-ANDERSON  41   21 JUMP STREET, BROOKLYN, NY
1541607980  JEFF     CANAVAN           31   32 WISTERIA LANE, BROOKLYN, NY
1531070616  BEVERLY  NAGLE             32   32 WISTERIA LANE, BROOKLYN, NY
240422663   MIKE     JONES             36   3215 VILLAGE CIR, GREENWICH VILLAGE, NY
Vertex attributes & variables: Age, Spend, Claim Velocity…
Now imagine you have 270 Million Vertexes and 24 Billion Edges.
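As a rough illustration of how such a graph can be held in memory at toy scale, here is a Python sketch that stores a few of the example rows and looks up a centroid's associates within a chosen degree. The container names and the `associates` helper are assumptions made for this sketch, not part of the original deck, and only a subset of the rows is reproduced.

```python
# A subset of the example rows: edges as (from_id, to_id, degree) tuples.
EDGES = [
    (1117906843, 1117906843, 0.00),
    (1117906843, 1166180939, 1.00),
    (1117906843, 2182832221, 1.30),
    (1117906843, 1531070616, 2.00),
    (240422663, 240422663, 0.00),
    (240422663, 1166180939, 1.00),
]

# Vertex attributes keyed by vertex id.
VERTEXES = {
    1117906843: {"f_name": "JAMES", "l_name": "ANDERSON", "age": 34},
    1166180939: {"f_name": "JANET", "l_name": "JACKSON-ANDERSON", "age": 36},
    2182832221: {"f_name": "KEVIN", "l_name": "PIETERSON", "age": 34},
    1531070616: {"f_name": "BEVERLY", "l_name": "NAGLE", "age": 32},
    240422663: {"f_name": "MIKE", "l_name": "JONES", "age": 36},
}


def associates(centroid, max_degree):
    """Names of vertexes linked to `centroid` with 0 < degree <= max_degree."""
    return [
        f"{VERTEXES[to_id]['f_name']} {VERTEXES[to_id]['l_name']}"
        for from_id, to_id, degree in EDGES
        if from_id == centroid and 0 < degree <= max_degree
    ]


close = associates(1117906843, 1.0)   # first-degree associates only
wider = associates(1117906843, 2.0)   # everyone within two degrees
```

The linear scan over `EDGES` is fine for a handful of rows; at billions of edges the same lookup needs the partitioned, distributed layout the talk goes on to describe.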
Simple example in ECL of a graph calculation at scale
IMPORT SNA, Person, Healthcare;

// Edges: each centroid (FromId) linked to every vertex (ToId) within 2 degrees.
Edges := Person.Clusters;
// Transactions: prescriptions keyed by person id.
Transactions := Healthcare.PrescriptionTransactions;

// Distribute both datasets across all nodes and do a distributed join (not indexed).
ClusterTransactions := JOIN(Edges, Transactions, LEFT.ToId = RIGHT.PersonId, HASH);

// Count prescriptions by drug name within 2 degrees of every centroid (person).
// The trailing FromId, drug_generic_name arguments group the aggregation.
ClusterStats := TABLE(ClusterTransactions,
                      {FromId,
                       drug_generic_name,
                       prescription_count := COUNT(GROUP),
                       prescription_1degree_count := COUNT(GROUP, degree <= 1),
                       prescription_2degree_count := COUNT(GROUP, degree > 1 AND degree <= 2)},
                      FromId, drug_generic_name);

// Keep the 200 centroids with the highest prescription counts for the drug of interest.
OUTPUT(TOPN(ClusterStats(drug_generic_name = 'HYDROCODONE'), 200, -prescription_count));
Top 200 patients within a social network with a high volume of patients receiving Vicodin prescriptions.
It is just a JOIN and an AGGREGATION
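To make the "partition, join, aggregate" idea concrete outside of ECL, here is a minimal Python sketch that simulates it on one machine. It is only an illustration under stated assumptions: the rows are hypothetical toy data, `NUM_NODES` stands in for a cluster, and hashing the join key with Python's `hash` is a stand-in for whatever distribution the platform actually performs.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows; field order mirrors the ECL example.
EDGES = [  # (from_id, to_id, degree)
    (1, 1, 0.0), (1, 2, 1.0), (1, 3, 1.5), (4, 4, 0.0), (4, 2, 1.0),
]
TRANSACTIONS = [  # (person_id, drug_generic_name)
    (1, "HYDROCODONE"), (2, "HYDROCODONE"), (3, "HYDROCODONE"),
    (3, "SIMVASTATIN"), (4, "IBUPROFEN"),
]
NUM_NODES = 3  # simulated cluster size (assumption)


def partition(rows, key_index):
    """Send each row to the node chosen by hashing its join key."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[hash(row[key_index]) % NUM_NODES].append(row)
    return nodes


def distributed_join(edges, transactions):
    """Join edges.to_id == transactions.person_id, one node at a time."""
    edge_parts = partition(edges, 1)
    txn_parts = partition(transactions, 0)
    joined = []
    for node in range(NUM_NODES):
        # Rows with equal keys landed on the same node, so a local join suffices.
        by_person = defaultdict(list)
        for person_id, drug in txn_parts.get(node, []):
            by_person[person_id].append(drug)
        for from_id, to_id, degree in edge_parts.get(node, []):
            for drug in by_person.get(to_id, []):
                joined.append((from_id, drug, degree))
    return joined


def cluster_stats(joined):
    """Aggregate prescription counts per (centroid, drug)."""
    return Counter((from_id, drug) for from_id, drug, _ in joined)


stats = cluster_stats(distributed_join(EDGES, TRANSACTIONS))
top = stats.most_common(1)  # the busiest (centroid, drug) pair
```

Because both inputs are partitioned on the join key, no node ever needs rows held by another node, which is why the join scales out without an index.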
KEY INDICATORS
• Tight social group of people who appear to be well connected to each other.
• Multiple family groups receiving HYDROCODONE.
• Cluster Stats put this group socially in the 0.0005% of 1 million+ HYDROCODONE prescriptions.
• One of the doctors tied to the HYDROCODONE prescriptions in the cluster is also socially a member of this social group.
HYDROCODONE CLUSTER
drug_generic_name drug_count
HYDROCODONE BITARTRATE/ACETAMINOPHEN 21
SIMVASTATIN 14
FLUTICASONE PROPIONATE -Q7PX 12
LEVOTHYROXINE SODIUM 8
ZOLPIDEM TARTRATE 7
CEPHALEXIN MH 7
CIPROFLOXACIN HYDROCHLORIDE 7
CITALOPRAM HYDROBROMIDE -H2SX 7
CYCLOBENZAPRINE HCL 7
ALENDRONATE SODIUM 7
METOPROLOL TARTRATE 6
AZITHROMYCIN 6
ALBUTEROL SULFATE 6
IBUPROFEN 6
LISINOPRIL 6
LORAZEPAM 6
ATORVASTATIN CALCIUM 5
TRAZODONE HCL -H7EX 5
BECLOMETHASONE DIPROPIONATE 5
FAMOTIDINE 5
GUAIFENESIN/CODEINE PHOS -B4SX 5
AMOXICILLIN TRIHYDRATE 5
METRONIDAZOLE 4
NORGESTIMATE-ETHINYL ESTRADIOL 4
SULFAMETHOXAZOLE/TMP 4
TAMSULOSIN HCL -Q9BX 4
VARDENAFIL HCL -F2AX 4
WARFARIN SODIUM 4
ACYCLOVIR 4
INTERESTING HYDROCODONE CLUSTER
MIKE JONES MD
• Is the prescribing doctor who prescribed Vicodin to patients in the target social cluster (James Anderson).
• He is a member of the same social cluster.
• Also personally filled a Vicodin prescription for himself.
Question: Is it normal for you and 15 of your associates to all receive a prescription for Vicodin within the same short timespan?
Relationships are from public records (non-obvious in the healthcare data domain)
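The pattern on this slide, a prescriber who is socially inside the cluster he prescribes to, can be checked with a simple membership test once cluster membership and prescriptions sit side by side. The Python sketch below uses hypothetical toy data and made-up identifiers; in practice the membership sets would come from the public-records graph and the prescriptions from claims data.

```python
# Hypothetical toy data: centroid -> members of that centroid's social cluster.
CLUSTER_MEMBERS = {
    "james_anderson": {"james_anderson", "janet", "gary", "mike_jones_md"},
    "other_patient": {"other_patient", "friend_a"},
}
PRESCRIPTIONS = [  # (prescriber, patient, drug)
    ("mike_jones_md", "james_anderson", "HYDROCODONE"),
    ("mike_jones_md", "janet", "HYDROCODONE"),
    ("mike_jones_md", "mike_jones_md", "HYDROCODONE"),
    ("dr_smith", "friend_a", "HYDROCODONE"),
]


def prescribers_inside_cluster(centroid, drug):
    """Count `drug` prescriptions written to cluster members by a fellow member."""
    members = CLUSTER_MEMBERS[centroid]
    counts = {}
    for prescriber, patient, rx_drug in PRESCRIPTIONS:
        if rx_drug == drug and patient in members and prescriber in members:
            counts[prescriber] = counts.get(prescriber, 0) + 1
    return counts


flagged = prescribers_inside_cluster("james_anderson", "HYDROCODONE")
unflagged = prescribers_inside_cluster("other_patient", "HYDROCODONE")
```

A non-empty result is only a lead for review, not proof of wrongdoing; the social link is invisible in the healthcare data alone, which is the point the slide makes.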
SOCIALIZATION OF PRESCRIPTIONS: Social vs Non-Social Drugs
Number of prescriptions by social association.
Highlights which drugs show higher levels of socialization.
Highlights outliers and anomalous social patterns
Provides new insight and context at a social drug level.
Not all drugs are created “socially equal”.
Almost every prescription is in social isolation (> 96%)
Large % of prescriptions show socialization (long tail)
KEY
1: Means the number of prescriptions for that drug that are the ONLY prescription of that type within the social group.
2: Means the number of prescriptions for that drug that are within a social group where there is one other member receiving a prescription for that drug.
3: Means the number of prescriptions for that drug that are within a social group where there are two other members receiving a prescription for that drug.
And so on…
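The KEY describes a histogram per drug: each prescription is placed in the bucket equal to the number of people in its social group receiving that drug. A minimal Python sketch of that calculation follows. It makes two simplifying assumptions that are not from the talk: social groups are disjoint, and each person has at most one prescription per drug. All names and data are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: person -> social group, and (person, drug) prescriptions.
GROUP_OF = {"a": "g1", "b": "g1", "c": "g1", "d": "g2", "e": "g3", "f": "g3"}
PRESCRIPTIONS = [
    ("a", "HYDROCODONE"), ("b", "HYDROCODONE"), ("c", "HYDROCODONE"),
    ("d", "HYDROCODONE"), ("e", "DIGOXIN"), ("d", "DIGOXIN"),
]


def social_distribution(prescriptions, group_of):
    """Per drug, map KEY value -> number of prescriptions with that value."""
    # How many members of each group receive each drug.
    per_group = Counter((group_of[person], drug) for person, drug in prescriptions)
    dist = defaultdict(Counter)
    for (_group, drug), n in per_group.items():
        # All n prescriptions in this group share it with n - 1 other recipients.
        dist[drug][n] += n
    return dist


dist = social_distribution(PRESCRIPTIONS, GROUP_OF)
```

In this toy data every DIGOXIN prescription falls in bucket 1 (social isolation), while three of the four HYDROCODONE prescriptions fall in bucket 3.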
SOCIALIZATION OF PRESCRIPTIONS: Social vs Non-Social Drugs
DIGOXIN
SPIRONOLACTONE
METHOTREXATE SODIUM
ISOSORBIDE MONONITRATE
EZETIMIBE
TAMSULOSIN HCL
NIFEDIPINE
DILTIAZEM HCL
FINASTERIDE
ETANERCEPT
ENOXAPARIN SODIUM
ARIPIPRAZOLE
FINASTERIDE -Q9BX
RISPERIDONE
ISOSORBIDE DINITRATE
DIVALPROEX SODIUM
LEVETIRACETAM
PROPRANOLOL HCL
ANASTROZOLE
DOXAZOSIN MESYLATE
PHENYTOIN SODIUM EXTENDED
ESTROGENS, CONJUGATED VAG
ESTROGENS,CONJUGATED -Q4KX
ESTRADIOL -Q4KX
NIACIN
INSIGHTS INTO SOCIAL SPREAD OF PRESCRIPTION BRANDS
Understand what is normal per drug.
Detect and highlight social outliers
Develop an exclusion list for legitimately social drugs (e.g. Antibiotics & Vaccines)
At a drug name level measure unusual social spread.
More quickly see unusual drug patterns socially.
Might indicate recruitment or drug seeking behavior.
Strategically focus on problematic prescription types from a social spread perspective not an individual patient volume perspective.
Within all the claims focus on the smaller subset of those that are too social.
HYDROCODONE BITARTRATE/ACETAMINOPHEN
SIMVASTATIN
HYDROCODONE-ACETAMINOPHEN
IBUPROFEN
LISINOPRIL
ALBUTEROL SULFATE
ATENOLOL
FLUTICASONE PROPIONATE -Q7PX
OMEPRAZOLE
HYDROCHLOROTHIAZIDE
AMLODIPINE BESYLATE
LEVOTHYROXINE SODIUM
GLUCOSE BLOOD
METFORMIN HYDROCHLORIDE
FLUTICASONE PROPIONATE (N
METFORMIN HCL
BLOOD SUGAR DIAGNOSTIC
LOSARTAN POTASSIUM
GLIPIZIDE
PREDNISONE
AZITHROMYCIN
AMOXICILLIN
LANCETS
GUAIFENESIN/CODEINE PHOS -B4SX
AMOXICILLIN TRIHYDRATE
CYCLOBENZAPRINE HCL
LORAZEPAM
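One way to act on the insights on this slide (learn what is normal per drug, exclude legitimately social drugs, flag the rest) is to compare each drug's "social share" against a cut-off. The Python sketch below is an assumption-laden illustration: the distributions are invented numbers, the `THRESHOLD` value is arbitrary, and the exclusion list holds a single example entry.

```python
# Hypothetical per-drug distributions: KEY value -> number of prescriptions.
DISTRIBUTIONS = {
    "DIGOXIN": {1: 970, 2: 30},
    "HYDROCODONE": {1: 600, 2: 250, 3: 150},
    "AMOXICILLIN": {1: 500, 2: 300, 3: 200},
}
EXCLUDED = {"AMOXICILLIN"}  # legitimately social drugs, e.g. antibiotics
THRESHOLD = 0.25            # assumed cut-off, not taken from the talk


def social_share(distribution):
    """Fraction of prescriptions with at least one other recipient in the group."""
    total = sum(distribution.values())
    social = sum(count for key, count in distribution.items() if key >= 2)
    return social / total if total else 0.0


def flag_social_drugs(distributions, excluded, threshold):
    """Drugs outside the exclusion list whose social share exceeds the threshold."""
    return sorted(
        drug
        for drug, dist in distributions.items()
        if drug not in excluded and social_share(dist) > threshold
    )


flagged = flag_social_drugs(DISTRIBUTIONS, EXCLUDED, THRESHOLD)
```

A single global threshold is the crudest possible baseline; since the slide stresses understanding what is normal per drug, a per-drug baseline would be the natural refinement.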
SOCIALIZATION OF PRESCRIPTIONS: Social Prescription Patterns = Social Health Conditions
Number of prescriptions by social association.
Identify social clusters with a prescription pattern associated with the same health conditions.
Opportunity for strategic social intervention to influence health outcomes.
If you could identify the specific segment of your population that fit this social model, how would you leverage this opportunity?
KEY
1: Means the number of prescriptions for that drug that are the ONLY prescription of that type within the social group.
2: Means the number of prescriptions for that drug that are within a social group where there is one other member receiving that prescription drug.
3: Means the number of prescriptions for that drug that are within a social group where there are two other members receiving that prescription drug.
And so on…
[Charts: Social Distribution for OMEPRAZOLE, SIMVASTATIN and ATENOLOL. x-axis: KEY values 1 to 15 (1 to 12 for OMEPRAZOLE); y-axis: number of prescriptions.]
SOCIALIZATION OF PRESCRIPTIONS: Mapping the spread and density of social prescriptions
VARIABLES THAT FOCUS ON CROWDSOURCING PRESCRIPTIONS
Identify patient social groups with abnormal prescription densities.
Identify prescribers with unusual social prescription patterns to social groups.
Patients see them as an easy source of prescriptions?
Doctors that are more free with prescriptions for family and friends?
Sign of a larger fraud scheme?
Identify specific areas in the graph where there is an opportunity for social disruption.
Dense social clusters with similar health issues e.g. obesity, diabetes
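A simple way to surface social groups with abnormal prescription densities is to normalise each cluster's prescription count by its size and flag clusters that sit far above the mean. The Python sketch below is one possible approach under stated assumptions, not the method used in the talk: the cluster data is invented, and the z-score cut-off of 1.5 is arbitrary.

```python
from statistics import mean, pstdev

# Hypothetical clusters: centroid -> (cluster size, prescriptions of one drug).
CLUSTERS = {
    "c1": (20, 1), "c2": (25, 2), "c3": (18, 1),
    "c4": (22, 1), "c5": (16, 15), "c6": (30, 2),
}


def density(size, count):
    """Prescriptions per cluster member."""
    return count / size


def abnormal_clusters(clusters, z_cutoff=1.5):
    """Clusters whose density is more than z_cutoff std devs above the mean."""
    densities = {c: density(size, count) for c, (size, count) in clusters.items()}
    mu = mean(densities.values())
    sigma = pstdev(densities.values())
    if sigma == 0:
        return []  # every cluster has the same density, so nothing stands out
    return sorted(c for c, d in densities.items() if (d - mu) / sigma > z_cutoff)


outliers = abnormal_clusters(CLUSTERS)
```

Z-scores are sensitive to the very outliers they are meant to find, so on real, long-tailed data a rank- or percentile-based cut would likely be more robust.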
In Summary
Benefits of Graph Analysis at Scale for Healthcare
• Extremely rich source of new perspectives and insight.
• Value in cross-domain data at scale (you have to have the data, and we do).
• Straightforward to roll your own with the new breed of high-performance distributed big data processing systems.
Opportunities to disrupt healthcare networks
• Change healthcare outcomes.
• Tackle organized fraud networks at scale.
30 mins is too short for this topic!
@joprichard