Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Rapid Model Refresh (RMR) in Online Fraud Detection Engine SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. © 2010 SAS Institute Inc. All rights reserved. S55547.0410

Upload: liuwensui

Post on 21-Nov-2014

Page 1: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Rapid Model Refresh (RMR)

in Online Fraud Detection Engine

Page 2: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Agenda

Overview

Traditional Tactics Fighting Fraud

Best Practice in PayPal Fraud Detection

Rapid Model Refresh (RMR)

Extensions and Future

Page 3: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Online Fraud in Financial Services

Evolution in Financial Services

• Paper-Based

• In-Branch

• Perceptible Footprint

… …

• Electronic

• Cyber Spaces

• Invisible Marketplace

… …

Emerging Fraud Trends

• Old-Fashioned

• Isolated Individual

• Limited-Scope Damage

• Traceable Patterns

… …

• Tech-Savvy

• Organized Gang

• Multi-Billion Loss

• Dynamic Trends

… …

Page 4: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Industry Fact

Online Revenue Loss Due to Fraud (Loss in Billion $)

Year    Loss
2000    $1.5
2001    $1.7
2002    $2.1
2003    $1.9
2004    $2.6
2005    $2.8
2006    $3.1
2007    $3.7
2008    $4.0

Source: Cybersource

Page 5: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Agenda

Objectives

Traditional Tactics Fighting Fraud

Best Practice in PayPal Fraud Detection

Rapid Model Refresh (RMR)

Extensions and Future

Page 6: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Traditional Mitigation Tactics

Heuristic Approach
• Detect Anomalies
• Identify Patterns
• Set Review Criterion

Model-Based Score
• Rely on Statistical Models (Logit Models / Neural Nets)
• Generate Suspicion Score
• Rank Order Transactions

Rule-Based System
• Employ Machine Learning Algorithms
• Generate Rule Sets for Segmentation
• Target High-Risk Segments

Page 7: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Pros and Cons

Heuristic

Pros
• Integrate Domain Knowledge
• Easy to Implement

Cons
• Review-Based & Labor Intensive
• Local Solutions without Global View

Scoring

Pros
• Successful Industrial Applications
• Ideal for Large-Scale Domains

Cons
• Long Time-to-Market
• Static Perspective of Fraud Trends

Rule-Induction

Pros
• Fits Dynamic Online Nature
• Rapid Development & Deployment

Cons
• Require Frequent Refreshes
• Burden of High-Volume Rules

Page 8: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Next … …

Now What?

Page 9: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Agenda

Objectives

Traditional Tactics Fighting Fraud

Best Practices in PayPal Fraud Detection

Rapid Model Refresh (RMR)

Extensions and Future

Page 10: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

PayPal's Way to Fight Fraud

PayPal Loss Trend from 200X through 200Y

Page 11: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Multi-Level Detection Engine

Risk Scoring

• Modelers developed scoring models with logistic regression or neural networks.

• A risk score is assigned to each transaction through the system.

• Low-risk transactions are passed through.

Rule Induction

• Analysts built decision trees on high-risk transactions, rank-ordered by risk score.

• The riskiest segments are further identified by balancing the bad rate against the pass-through rate.

• The riskiest transactions identified by the rule sets are sent to review queues.

Agent Review

• Queued transactions are prioritized and routed to agents in specific domains.

• Case review and investigation are conducted.
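
The three stages on this slide can be read as a routing decision per transaction. The following is a minimal sketch under stated assumptions, not PayPal's implementation: the names `Transaction`, `route`, `toy_score`, and `toy_rules`, the feature names, and the cutoff are all hypothetical. It also assumes that a high-scoring transaction that no rule flags is passed through, which the slide does not state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class Transaction:
    txn_id: str
    features: Features


def route(
    txn: Transaction,
    score_fn: Callable[[Features], float],
    rules: List[Callable[[Features], bool]],
    score_cutoff: float,
) -> str:
    """Return 'pass' or 'review' for one transaction."""
    # Stage 1 (risk scoring): low-risk transactions pass straight through.
    if score_fn(txn.features) < score_cutoff:
        return "pass"
    # Stage 2 (rule induction): any firing rule marks a risky segment.
    if any(rule(txn.features) for rule in rules):
        # Stage 3 (agent review): the transaction is queued for an agent.
        return "review"
    # Assumption: high score but no rule hit is passed through.
    return "pass"


def toy_score(features: Features) -> float:
    """Hypothetical stand-in for a logistic regression or neural net score."""
    return min(1.0, features["amount"] / 1000.0)


# Hypothetical stand-in for rule sets induced from decision trees.
toy_rules = [lambda f: f["account_age_days"] < 7]
```

In this sketch only transactions that clear the score cutoff ever reach the rule sets, which mirrors the slide's description of trees being built on the high-risk population.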

Page 12: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Implementation Challenges

Realities and Problems

• Reality: Fast-Growing International Footprint
  Problem: Overwhelming Number of Segments & Models

• Reality: Extremely Rich Data from Diversified Sources
  Problem: Information Overload instead of Data Mining

• Reality: Ever-Complicated IT Infrastructure
  Problem: High Exposures to System Risks

• Reality: Dynamic Fraud Trends & Smarter Fraudsters
  Problem: Escalating Model Decay & Deterioration

Page 13: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Data-Driven Model (DDM) Strategy

Conceptual DDM

• Modular Data Processing

• Automatic Model Development

• Dynamic Rule Induction

• Real-Time Deployment

• Daily Monitoring

Implemented by Rapid Model Refresh (RMR)

Page 14: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Agenda

Objectives

Traditional Tactics Fighting Fraud

Best Practice in PayPal Fraud Detection

Rapid Model Refresh (RMR)

Extensions and Future

Page 15: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

What’s RMR?

Three Common Layers

Data Layer

• Packaged Processing

• Optimized Queries

• Repeatable Stream

Algorithm Layer

• Arbitrary Models

• Standard Evaluation

• Version Controlled

Deployment Layer

• Model Specs. to XML

• Deploy in Real-Time

• Batched Monitor

Page 16: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

RMR – Data Layer

Enterprise Database

Web Logs

3rd-Party Sources

Coarse Layer

Variables Creation / Imputation / Transformation

Model Development

SAS Data

Fine Layer

Modular SAS Macros & Parameterized Scripts

SAS as Wrapper around Shell / SED / BTEQ Scripts

Page 17: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Data Layer at a Glance

SAS Workflow

20+ SAS Macros

Shell Scripts

SED Stream Editor

BTEQ Interface with Teradata

Data Manipulation

Variable Transformation

Create Dynamic SQL

Parallel Execution

Update Parameters in Scripts

Submit SQL

Page 18: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Code Snippet in Data Layer

1. Use SED to update parameters in the query

2. Submit the query to Teradata through BTEQ

3. Append the log to an output file
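
The original snippet is not preserved in this transcript, so the three steps above are sketched here in Python rather than in the SAS, shell, and SED combination the deck describes. This is an illustration under stated assumptions: `render_query` and `submit_and_log` are hypothetical names, the `$name` placeholder syntax stands in for the SED substitution, and the client command defaults to `bteq` reading the script from standard input.

```python
import subprocess
from string import Template
from typing import Mapping, Sequence


def render_query(template_sql: str, params: Mapping[str, str]) -> str:
    """Step 1: fill $name placeholders (the role SED plays in the deck)."""
    return Template(template_sql).substitute(params)


def submit_and_log(
    sql: str,
    log_path: str,
    client_cmd: Sequence[str] = ("bteq",),
) -> int:
    """Steps 2 and 3: pipe the script to the client and append its output."""
    # Mode "a" appends, so repeated runs accumulate in one log file.
    with open(log_path, "a", encoding="utf-8") as log:
        result = subprocess.run(
            list(client_cmd),
            input=sql,                 # script is fed on standard input
            text=True,
            stdout=log,
            stderr=subprocess.STDOUT,  # keep errors in the same log
        )
    return result.returncode
```

A call such as `render_query("SELECT COUNT(*) FROM txn WHERE txn_date = '$run_date';", {"run_date": "2010-05-12"})` shows the parameter update; the table and column names there are made up. Passing `client_cmd` explicitly makes it possible to dry-run the flow with a stand-in command on a machine without Teradata tools.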

Page 19: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

RMR – Algorithm Layer

Model Evaluation (KS / AUC / …)

Swap Analysis for Rule Sets

Supported by SAS/STAT & SAS Enterprise Miner

Champion

•Generalized Linear Model

Arbitrary Challengers

•Neural Nets

•Bagging Trees

… …

Bumping

•Stochastic Search for Best Tree(s)

Stump

•Exhaustive Search for Best Cutoffs

Best Models to Production
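
The slide names KS and AUC as evaluation measures. As a hedged illustration of what those two numbers are, the sketch below computes them in plain Python; the deck itself relies on SAS evaluation macros, and the function names `ks_statistic` and `auc` are assumptions made for this example. Labels use 1 for fraud ("bad") and 0 for non-fraud ("good").

```python
from typing import Sequence


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Largest gap between the cumulative score distributions of bads and goods."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("both classes are required")
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume tied scores together so the gap is measured between distinct scores.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        best = max(best, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return best


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random bad outscores a random good (ties count half)."""
    bad = [s for s, y in zip(scores, labels) if y == 1]
    good = [s for s, y in zip(scores, labels) if y == 0]
    if not bad or not good:
        raise ValueError("both classes are required")
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))
```

The pairwise AUC is quadratic in the sample size and is written for readability; a rank-based formula would be the usual choice at production volumes.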

Page 20: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

A Peek into Algorithm Layer

50% Training

SAS EDA Macros

WoE Vars

Binned Vars

GLM

NNET

Bagging

Tree2 … … TreeX

25% Testing

25% Validation

SAS Evaluation Macros

Best Model
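
Two pieces of this diagram lend themselves to a short illustration: the 50% / 25% / 25% partition and the weight-of-evidence (WoE) variables. The sketch below is an assumption-laden stand-in for the SAS EDA macros, which are not shown in the deck. The function names are hypothetical, the WoE sign convention used here is ln(share of goods / share of bads), and the additive `smoothing` term is an added safeguard against empty cells rather than something the slide specifies.

```python
import math
import random
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def split_50_25_25(n_rows: int, seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle row indices and cut them into training, testing, validation."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_train, n_test = n_rows // 2, n_rows // 4
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


def weight_of_evidence(
    bins: Sequence[Hashable],
    labels: Sequence[int],
    smoothing: float = 0.5,
) -> Dict[Hashable, float]:
    """WoE per bin: ln(share of goods in bin / share of bads in bin)."""
    good, bad = Counter(), Counter()
    for b, y in zip(bins, labels):
        (bad if y == 1 else good)[b] += 1
    levels = set(good) | set(bad)
    # Smoothing is added to every cell, so totals grow by smoothing * #bins.
    total_good = sum(good.values()) + smoothing * len(levels)
    total_bad = sum(bad.values()) + smoothing * len(levels)
    return {
        b: math.log(
            ((good[b] + smoothing) / total_good)
            / ((bad[b] + smoothing) / total_bad)
        )
        for b in levels
    }
```

With this convention a bin dominated by fraud gets a negative WoE. Some scorecard texts use the opposite sign, so the direction should be checked against whatever the downstream GLM expects.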

Page 21: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

One Tree, Endless Possibilities

Use Cases of Decision Tree in RMR’s View

Bagging
• Simple Average of Massive Number of Trees
• Take Advantage of RMR Deployment Layer and Parallel Computing
• Use as a Challenger to Traditional Logistic Regression

Bumping
• Stochastic Search from Massive Number of Trees
• Improve Estimation while Retaining Simple Tree Structure
• Use to Enhance Vanilla-Version Tree Development

Stump
• Exhaustive Search on 1-Dimensional Space, e.g. Score
• Induce 1-Level Binary Tree by Minimizing Gini Impurity
• Use to Find the Best Score Cutoff while Balancing Review Rate
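
The stump use case is concrete enough to sketch: try every possible cutoff on a single score and keep the one with the lowest weighted Gini impurity. The code below is an illustrative sketch, not the deck's SAS implementation. The name `best_stump_cutoff` is hypothetical, and "balancing review rate" is modeled here as a cap (`max_review_rate`) on the share of transactions at or above the cutoff, which is one possible reading of the slide.

```python
from typing import Optional, Sequence, Tuple


def gini(n_bad: int, n: int) -> float:
    """Gini impurity of a binary node with n_bad positives out of n."""
    if n == 0:
        return 0.0
    p = n_bad / n
    return 2.0 * p * (1.0 - p)


def best_stump_cutoff(
    scores: Sequence[float],
    labels: Sequence[int],
    max_review_rate: float = 1.0,
) -> Optional[Tuple[float, float]]:
    """Return (cutoff, weighted_gini), or None if no admissible split exists.

    Transactions with score >= cutoff form the high-risk (review) side.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    total_bad = sum(y for _, y in pairs)
    best = None
    left_n = left_bad = 0
    for i in range(1, n):
        # Move one more row to the low-score side.
        left_n += 1
        left_bad += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical scores
        right_n = n - left_n
        if right_n / n > max_review_rate:
            continue  # would send too many transactions to review
        impurity = (
            left_n * gini(left_bad, left_n)
            + right_n * gini(total_bad - left_bad, right_n)
        ) / n
        cutoff = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        if best is None or impurity < best[1]:
            best = (cutoff, impurity)
    return best
```

Because the search space is one-dimensional, sorting once and sweeping the split point keeps the exhaustive search at O(n log n), so it stays cheap even when repeated across many segments.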

Page 22: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Pick Winner from Multiple Candidates

Generically Supports an Arbitrary Number of Score Inputs for Large-Scale Model Evaluation and Deployment

SCORECARD EVALUATION SUMMARY

                       BEST MODEL       PREDICTABILITY MEASURE
                       (Samples 1-4)    (Samples 1-4)

SEGMENT 03
Champion Model         0 0 1 0          55 52 54 54
Challenger Model 1     0 0 0 0          58 55 60 58
Challenger Model 2     1 1 0 1          61 59 64 62
Challenger Model 3     0 0 0 0          57 53 59 56

SEGMENT 04
Champion Model         1 0 1 1          52 46 43 40
Challenger Model 1     0 0 0 0          48 42 41 36
Challenger Model 2     0 1 0 0          52 45 45 43
Challenger Model 3     0 0 0 0          44 38 37 35

SEGMENT 05
Champion Model         1 1 1 1          72 74 74 73
Challenger Model 1     0 0 0 0          65 66 67 65
Challenger Model 2     0 0 0 0          69 71 72 72
Challenger Model 3     0 0 0 0          64 65 67 66

SEGMENT 06
Champion Model         0 1 0            81 76 72 70
Challenger Model 1     0 0 0 0          70 64 63 60
Challenger Model 2     1 0 1 1          81 75 72 71
Challenger Model 3     0 0 0 0          71 63 62 59

Page 23: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

RMR – Deployment Layer

Model Specifications

Convert to XML / PMML

Inject into Web Engine

Collect Web Logs in DB

Monitor Daily Scoring Stability

Email Reports to Stakeholders

Perl

Shell

SAS
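
The first two boxes on this slide, turning model specifications into XML, can be illustrated with a small round trip. The element and attribute names below are invented for the example and are neither PMML nor the schema used in the deck; `logistic_spec_to_xml` and `score_from_xml` are likewise hypothetical. The point of the sketch is only that a scoring engine can be driven by a published document instead of hard-coded coefficients.

```python
import math
import xml.etree.ElementTree as ET
from typing import Mapping


def logistic_spec_to_xml(
    model_name: str,
    version: str,
    intercept: float,
    coefficients: Mapping[str, float],
) -> str:
    """Serialize a logistic model to a small, made-up XML layout."""
    root = ET.Element("model", {"name": model_name, "version": version, "type": "logistic"})
    ET.SubElement(root, "intercept").text = repr(intercept)
    terms = ET.SubElement(root, "terms")
    for variable, coef in coefficients.items():
        ET.SubElement(terms, "term", {"variable": variable, "coefficient": repr(coef)})
    return ET.tostring(root, encoding="unicode")


def score_from_xml(xml_text: str, features: Mapping[str, float]) -> float:
    """Score one record using only what the XML document specifies."""
    root = ET.fromstring(xml_text)
    z = float(root.findtext("intercept"))
    for term in root.iter("term"):
        z += float(term.get("coefficient")) * features[term.get("variable")]
    return 1.0 / (1.0 + math.exp(-z))
```

Swapping in a refreshed model then means publishing a new document with a new `version` attribute, with no change to the scoring code.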

Page 24: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

A Use Case: Score Monitoring

Objectives:

• Score Shift

• System Breakage

Lookup Tables

• Driver Table

• Log Table

Model / Segment / Owner Lookups

Baseline Distribution

Daily Web Log

SAS Daily Job Scheduled by Cron

Population Stability Reports in HTML

Page 25: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Sample Reports

Back-End

Overall

OVERALL SUMMARY of POPULATION STABILITY INDEX on 05/12/2010

MODEL TYPE / NAME   VERSION   TIER   SEGMENT   DAILY VOLUME   % VALID   % MISSING   PSI
GWM                 1         1      1         7027           100.00%   0.00%       0.0084
GWM                 1         1      2         37388          95.00%    5.00%       0.0068
GWM                 1         1      3         33336          100.00%   0.00%       0.0174
GWM                 1         1      4         2410           100.00%   0.00%       0.2529
GWM                 1         1      5         27924          100.00%   0.00%       0.0121
GWM                 1         1      6         13093          100.00%   0.00%       0.0188

Detailed

POPULATION STABILITY INDEX Details for GWM Segment 2

MIN. SCORE   MAX. SCORE   FREQ.   EXPECTED DISTRIBUTION   ACTUAL DISTRIBUTION   PSI
Low          521          342     5.00%                   4.87%                 0.0000
521          540          324     5.00%                   4.61%                 0.0003
540          553          353     5.00%                   5.02%                 0.0000
553          562          330     5.00%                   4.70%                 0.0001
562          569          328     5.00%                   4.67%                 0.0002
569          576          359     5.00%                   5.11%                 0.0000
576          581          331     5.02%                   4.71%                 0.0001
581          587          396     5.04%                   5.64%                 0.0006
587          591          325     4.94%                   4.63%                 0.0002
… …
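
The PSI columns in these reports follow from the expected and actual distributions. A minimal sketch of the standard population stability index is given below, assuming the usual definition (the sum over score bins of (actual - expected) × ln(actual / expected)); the deck does not show its own SAS code, and the function name `psi` and the `floor` guard against empty bins are assumptions made for this example.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float], floor: float = 1e-6) -> float:
    """Population stability index over matching score bins.

    Both inputs are bin shares (fractions, not percentages).
    """
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    total = 0.0
    for e, a in zip(expected, actual):
        # Floor tiny shares so an empty bin does not produce log(0).
        e = max(e, floor)
        a = max(a, floor)
        total += (a - e) * math.log(a / e)
    return total
```

As a check against the detailed report, the bin with 5.00% expected and 4.61% actual contributes about 0.0003, in line with the value printed for that row. Summing the contributions over all bins gives the per-segment PSI shown in the overall summary.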

Page 26: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Agenda

Objectives

Traditional Tactics Fighting Fraud

Best Practice in PayPal Fraud Detection

Rapid Model Refresh (RMR)

Extensions and Future

Page 27: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

Evolution of RMR Paradigm

Past: Expert Process

• Programmers Pull Data

• Statisticians Build Predictive Model

• Engineers Hard-Code Specification into On-Line Environment

• Meets Minimum Benefit Schedule.

Now: Mechanized Process

• Population and Performance Criterion Identified

• A Suite of Challenger Models Built Automatically

• Model Specifications Published in Live Scoring Platform

• New Models Deployed in Periodic Batch

Future: Online Process

• Models Developed & Deployed with Most Recent Online Data Dynamically

• Re-deployment of New Models not Needed

Page 28: Rapid Model Refresh (RMR) in Online Fraud Detection Engine

2-Path Directions

Hadoop with R Integration

SAS / Teradata in-DB Analytics