Square's Machine Learning Infrastructure and Applications - Rong Yan
DESCRIPTION
http://www.hakkalabs.co/articles/squares-machine-learning-infrastructure-applications
TRANSCRIPT
![Page 1: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/1.jpg)
May 15, 2014
Rong Yan
Machine Learning @ Square
![Page 2: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/2.jpg)
Birth of Square
![Page 3: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/3.jpg)
Payment
Stand, Reader
Payment Device, Payment Aggregation
Risk Model
![Page 4: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/4.jpg)
Payment
Commerce Cash
Market
![Page 5: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/5.jpg)
Our Mission Make commerce easy.
![Page 6: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/6.jpg)
Payment
Data
Commerce
The Next Big Thing
![Page 7: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/7.jpg)
3M+ Readers
$15B+ Annualized
Scale
![Page 8: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/8.jpg)
Offline and Online
Amount
Location
Item Desc.
Card #
Credit Score
Friends
Activity History
Inventory
![Page 9: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/9.jpg)
Sales Volume
![Page 10: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/10.jpg)
Haircut Price
![Page 11: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/11.jpg)
Turn Data into Business Value
Fraud Detection
Business Insight
Customer Relation
Information Discovery
![Page 12: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/12.jpg)
Fraud Detection @ Square
![Page 13: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/13.jpg)
Fraud Detection in the payment flow
Bank
Clears for settlement
Suspect ~2000 sellers
Risk Ops: Transaction review
150,000 active sellers per day
Risk ML Fraud Detection
![Page 14: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/14.jpg)
Payments
near-real-time
ML Architecture
Merchant
Devices
Bank Accounts
Machine Learning (300+ features)
Suspicions
![Page 15: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/15.jpg)
Card not present: Yes
Pan Diversity: 0.05
Use iPhone: No
Feature Generation
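A minimal sketch of how per-seller features like the three above might be derived from raw payment records. The record layout and the definition of PAN diversity used here (distinct card numbers divided by payment count) are assumptions for illustration; the slide does not define them.

```python
from dataclasses import dataclass


@dataclass
class Payment:
    # Hypothetical record layout, not Square's schema.
    card_number: str
    card_present: bool
    device: str


def seller_features(payments):
    """Aggregate one seller's payments into a flat feature dict."""
    n = len(payments)
    if n == 0:
        return {"card_not_present": False, "pan_diversity": 0.0, "use_iphone": False}
    distinct_cards = len({p.card_number for p in payments})
    return {
        # True if any payment was keyed in without the card being present.
        "card_not_present": any(not p.card_present for p in payments),
        # Assumed definition: distinct card numbers per payment.
        "pan_diversity": distinct_cards / n,
        "use_iphone": any(p.device == "iphone" for p in payments),
    }


# Twenty card-not-present payments on one card from a non-iPhone device.
payments = [Payment("4111", False, "android")] * 20
features = seller_features(payments)
```

With this input the sketch reproduces the slide's example values: card not present, PAN diversity 0.05, no iPhone.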
![Page 16: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/16.jpg)
Decision Tree Model
‣ Easy to interpret
‣ Dimension reduction
‣ Very powerful in ensemble
[Tree diagram: Yes/No splits on Decline Rate >= 0.1, Amount <= $10000, and Business Type = Auto repair; leaf scores shown: 0.9, 0.6]
![Page 17: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/17.jpg)
Random Forests: Decision Tree Ensemble
Tree 1: Yes/No splits on Decline Rate <= 0.1, Amount <= $10000, Business Type = Auto repair (leaf scores 0.9, 0.6). Output: Bad, 0.9
Tree 2: Yes/No splits on Success Rate <= 0.2, Age >= 20, Amount <= $1000 (leaf scores 0.4, 0.7). Output: Good, 0.4
Tree N: Yes/No splits on Decline Rate <= 0.3, Amount <= $20000, Age <= 22 (leaf scores 0.8, 0.6). Output: Bad, 0.6
Mode for classification = Bad
Average for regression = 0.63
Breiman, Leo (2001). "Random Forests". Machine Learning 45 (1): 5–32.
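The way the ensemble combines its trees can be sketched in a few lines, using the three per-tree outputs shown on the slide. The function names are illustrative only.

```python
from collections import Counter
from statistics import mean

# Per-tree outputs as (label, score), taken from the slide.
tree_outputs = [("Bad", 0.9), ("Good", 0.4), ("Bad", 0.6)]


def classify(outputs):
    """Majority vote (mode) over the per-tree labels."""
    labels = [label for label, _ in outputs]
    return Counter(labels).most_common(1)[0][0]


def regress(outputs):
    """Average of the per-tree scores."""
    return mean(score for _, score in outputs)


label = classify(tree_outputs)            # "Bad"
score = round(regress(tree_outputs), 2)   # 0.63
```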
![Page 18: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/18.jpg)
Random Forests - Build each Tree
All data
![Page 19: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/19.jpg)
Random Forests - Build each Tree
All data → Samples
![Page 20: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/20.jpg)
Random Forests - Build each Tree
Features
Dollar Amount
Connected with bad user
Business Type
Decline Rate
Time of Day
Location
Randomly select sqrt(n) features
![Page 21: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/21.jpg)
Random Forests - Build each Tree
Best split: feature and value
Decline Rate <= 0.1 (No / Yes), scores 0.4 and 0.6
![Page 22: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/22.jpg)
Random Forests - Build each Tree
Grow Tree (on each branch)
![Page 23: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/23.jpg)
Random Forests - Build each Tree
When sample size is small: STOP
![Page 24: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/24.jpg)
Random Forests - Build each Tree
Repeat these steps multiple times to create a forest
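The build steps on these slides (sample the data, pick sqrt(n) random features, take the best split, grow both branches, stop on small nodes, repeat) can be sketched in plain Python. This is a teaching sketch, not Square's implementation: the Gini criterion, the `min_size` cutoff, and the synthetic data are all assumptions.

```python
import math
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(rows, labels, rng, min_size=5):
    """Grow one tree; stop when the node is small or no split helps."""
    leaf = sum(labels) / len(labels)  # fraction of positives in this node
    if len(rows) <= min_size:
        return leaf
    n_features = len(rows[0])
    k = max(1, int(math.sqrt(n_features)))  # randomly select sqrt(n) features
    split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return leaf
    f, t = split
    yes = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    no = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in yes], [y for _, y in yes], rng, min_size),
            grow_tree([r for r, _ in no], [y for _, y in no], rng, min_size))


def build_forest(rows, labels, n_trees=25, seed=0):
    """Repeat the per-tree steps on bootstrap samples to create a forest."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def predict(forest, row):
    """Average the leaf scores reached in each tree."""
    total = 0.0
    for node in forest:
        while isinstance(node, tuple):
            f, t, yes, no = node
            node = yes if row[f] <= t else no
        total += node
    return total / len(forest)


# Synthetic data: four features, label driven by feature 0 only.
data_rng = random.Random(1)
rows = [[data_rng.random() for _ in range(4)] for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]
forest = build_forest(rows, labels)
high = predict(forest, [0.95, 0.5, 0.5, 0.5])
low = predict(forest, [0.05, 0.5, 0.5, 0.5])
```

Drawing a fresh feature subset at every split, rather than once per tree, follows Breiman's formulation; the slide does not say which variant was used.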
![Page 25: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/25.jpg)
Boosting Trees
Tree 1
![Page 26: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/26.jpg)
Boosting Trees
Tree 1 Tree 2
Help Tree 1
![Page 27: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/27.jpg)
Boosting Trees
Tree 1 Tree 2 Tree 3 Tree 4
Help Tree 1
Help Tree 1, 2
Help Tree 1, 2, 3
Stop when no help needed
0 weights all samples
![Page 28: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/28.jpg)
Boosting Trees
Tree 1 Tree 2 Tree 3 Tree 4
Help Tree 1
Help Tree 1, 2
Help Tree 1, 2, 3
7.5 = 8.0 + (-2.0) + 1.0 + 0.5
![Page 29: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/29.jpg)
Boosting Trees - Algorithm
Objective function:
Loss
Friedman, J. H. "Greedy Function Approximation: A Gradient Boosting Machine." 1999
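The objective formula itself did not survive in this transcript. As a hedged illustration of the "each tree helps the previous ones" idea, the sketch below assumes squared-error loss, where the negative gradient is simply the residual, and uses one-split regression stumps on a single feature. The learning rate, tolerance, and toy data are assumptions, and the input needs at least two distinct feature values.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regressor for the residuals (squared error)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the max so the right side is non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, n_trees=50, learning_rate=0.5, tol=1e-6):
    """Each new stump fits what the current ensemble still gets wrong."""
    trees = []
    preds = [0.0] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        if max(abs(r) for r in residuals) < tol:
            break  # stop when no help is needed
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    # The final prediction is the sum of all (scaled) tree outputs.
    return lambda x: sum(learning_rate * tree(x) for tree in trees)


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
model = boost(xs, ys)
```

On this toy step function the ensemble converges toward 1.0 on the left half and 5.0 on the right, with each added stump contributing a smaller correction than the one before.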
![Page 30: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/30.jpg)
Results - Precision
Precision at a fixed recall level

| Model | April | May | June |
|---|---|---|---|
| Random Forest | 76% | 77% | 80% |
| Boosting Trees | 85% | 82% | 88% |
| Relative improvement | +11.8% | +6.5% | +10% |
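A minimal sketch of the metric and of the improvement figures. It assumes "precision at a fixed recall level" means ranking payments by score and reading precision at the first cutoff where recall reaches the target; ties between scores are not handled specially. The toy scores and labels are invented for illustration.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the highest score cutoff whose recall reaches the target."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = 0
    for flagged, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            return true_pos / flagged
    return total_pos / len(labels)


def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Toy example: 3 fraud cases among 6 scored payments.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 0]
p_half = precision_at_recall(scores, labels, 0.5)
p_full = precision_at_recall(scores, labels, 1.0)

# (Random Forest, Boosting Trees) precision for April, May, June from the slide.
gains = [round(relative_gain(b, r), 1) for r, b in [(76, 85), (77, 82), (80, 88)]]
```

The computed gains match the +11.8%, +6.5%, and +10% figures on the slide, which confirms they are relative rather than absolute improvements.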
![Page 31: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/31.jpg)
Results - Fraud Detection Recall
[Plot: Fraud $ Prevented vs. # Payments to Reject; labels: Easy, Medium, Hard]
![Page 32: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/32.jpg)
Data Sampling
Highly biased in label distribution
- Less than 1 in 1000
Weighted training
- Higher weights on positive samples => oscillation
- Lower weights on negative samples => no real gain
Solution
- Keep negative:positive ratio between 3:1 and 10:1
- Scale the final model if calibration is needed
Less data requires fewer resources to train
Observed +10% improvement when moving the ratio from 20:1 to 3:1
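A sketch of the downsampling step and one common way to rescale scores afterwards. The slide only says to "scale the final model"; the prior-correction formula below is a standard choice assumed here, not necessarily the one Square used. It requires at least one negative sample.

```python
import random


def downsample_negatives(rows, labels, ratio=3, seed=0):
    """Keep all positives and about `ratio` negatives per positive."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_n = min(len(neg), ratio * len(pos))
    keep = sorted(pos + rng.sample(neg, keep_n))
    beta = keep_n / len(neg)  # fraction of negatives retained
    return [rows[i] for i in keep], [labels[i] for i in keep], beta


def calibrate(p_sampled, beta):
    """Map a score from the downsampled model back to the original prior."""
    return beta * p_sampled / (beta * p_sampled + 1.0 - p_sampled)


# 1 positive in 1000, downsampled to a 3:1 negative:positive ratio.
labels = [1] * 10 + [0] * 9990
rows = list(range(len(labels)))
sampled_rows, sampled_labels, beta = downsample_negatives(rows, labels, ratio=3)
original_prior = calibrate(0.25, beta)  # 0.25 is the positive rate after sampling
```

As a sanity check, the post-sampling positive rate of 0.25 maps back to the original rate of 0.001.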
![Page 33: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/33.jpg)
Productionalize Machine Learning
![Page 34: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/34.jpg)
Startup Architecture
‣ Ruby-on-Rails + MySQL
‣ MySQL replication
‣ Tied to production schema
‣ Hard to do complex analysis
![Page 35: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/35.jpg)
Scale it up: SOA + Data Warehouse
‣ Java services
‣ APIs
‣ HDFS
![Page 36: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/36.jpg)
Scale it up: Data Transport
‣ Append-only feeds
‣ Kafka
‣ Replication
‣ Protocol buffers
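To illustrate why append-only feeds suit this kind of pipeline, here is an in-memory stand-in: records are only ever appended, and each consumer tracks its own read offset, so a new consumer can replay the full history. The class and method names are hypothetical; this is not the Kafka client API, and real feeds would carry serialized messages rather than strings.

```python
class AppendOnlyFeed:
    """In-memory stand-in for an append-only feed; records are never changed."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Add a record at the end and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset, limit=100):
        """Return up to `limit` records starting at `offset`."""
        return self._records[offset:offset + limit]


class Consumer:
    """Reads a feed sequentially, remembering how far it has got."""

    def __init__(self, feed):
        self.feed = feed
        self.offset = 0

    def poll(self):
        batch = self.feed.read_from(self.offset)
        self.offset += len(batch)
        return batch


feed = AppendOnlyFeed()
for event in ("payment:1", "payment:2", "payment:3"):
    feed.append(event)

scorer = Consumer(feed)
first = scorer.poll()           # the three events written so far
feed.append("payment:4")
second = scorer.poll()          # only the new event
replay = Consumer(feed).poll()  # a fresh consumer sees the whole history
```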
![Page 37: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/37.jpg)
Payments
Highly Available
Merchant
Devices
Bank Accounts
Suspicions
![Page 38: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/38.jpg)
Parallel Environments and Data Integrity
Blue
Green
VIP
upstream
![Page 39: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/39.jpg)
Other ML @ Square
Square Random Forest
Learning Management
Recommendation
![Page 40: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/40.jpg)
Square Random Forest

| RF Learner | Implementation | Time (Train / Test) |
|---|---|---|
| RiskML Random Forest (Built on Scikit-Learn) | C / Cython / Python (Open Source + Square Code) | 72 minutes |
| WiseRF | C++ (Proprietary) | 23 minutes |
| Square Random Forest | Java (Square Code) | 15 minutes |

Note: time reported on 3M training and 15M testing data
![Page 41: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/41.jpg)
Learning Management System
‣ Support non-sophisticated users
‣ Fast ad-hoc analytics
‣ Accessible to everyone for easy model generation and evaluation
‣ Tracks results to ensure different models can be compared
![Page 42: Square's Machine Learning Infrastructure and Applications - Rong Yan](https://reader034.vdocuments.mx/reader034/viewer/2022052522/554f45a4b4c905524c8b45ad/html5/thumbnails/42.jpg)
Square Market Recommendation
10x conversion rate vs. random baseline