Credit Scoring with Deep Learning
Håvard Kvamme
1
Kjersti Aas
Ørnulf Borgan
Håvard Kvamme
Nikolai Sellereite
Steffen Sjursen
2
3
Credit Scoring
7
Credit Scoring
➢ Determine loan eligibility
➢ Evaluate existing loans
○ Easier / more information
■ Payment history
➢ Mortgage
➢ Car loan
➢ Credit Card
➢ etc.
9
➢ Estimate default probabilities
➢ Statistical prediction / machine learning
➢ Objective:
○ Default within one year (2, 3, etc.)
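The objective above can be turned into a binary training label. The following is a minimal sketch, assuming each customer has an observation date and an optional default date; the function name `defaults_within` and both inputs are hypothetical and not taken from the talk.

```python
from datetime import date
from typing import Optional


def defaults_within(observed: date, default_date: Optional[date], years: int = 1) -> int:
    """Return 1 if a default occurs within `years` after `observed`, else 0."""
    if default_date is None:
        # Customer never defaulted in the available data.
        return 0
    try:
        horizon_end = observed.replace(year=observed.year + years)
    except ValueError:
        # `observed` is Feb 29 and the target year is not a leap year.
        horizon_end = observed.replace(year=observed.year + years, day=28)
    return int(observed < default_date <= horizon_end)


# Illustrative dates only.
label_1y = defaults_within(date(2015, 1, 1), date(2016, 6, 1), years=1)
label_2y = defaults_within(date(2015, 1, 1), date(2016, 6, 1), years=2)
```

Changing `years` gives the 2-year, 3-year, etc. variants of the same target.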
11
Credit Scoring
➢ Current balance on accounts
➢ Loan balances
➢ Previous delinquencies
➢ Age
➢ Profession
➢ Salary
➢ etc.
12
Credit Scoring
So…
What is new?
13
In 2012, the average Norwegian made 323 card transactions, where 71% of the value transferred was through debit payments.
(Norges-Bank, 2012)
14
Customer transaction time series
➢ Checking
➢ Savings
➢ Credit card
➢ Checking: Number of transactions
➢ Checking: Into checking
15
Time series classification
➢ Create features
○ Mean, std, max, min, etc.
○ TS analysis features
■ E.g. parameters from ARIMA
○ DFT, DWT
➢ Data-driven features
○ MLP
○ Convolutional Neural Nets
○ Recurrent Neural Nets
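The hand-crafted route ("Create features") can be sketched as follows. This is an illustration only, assuming a series is a plain list of numbers; the helper names `summary_features` and `dft_magnitudes` and the toy balances are hypothetical. The DFT is computed naively to keep the example dependency-free.

```python
import cmath
import math
import statistics


def summary_features(series):
    """Simple summary statistics of one time series."""
    return {
        "mean": statistics.mean(series),
        "std": statistics.pstdev(series),  # population standard deviation
        "max": max(series),
        "min": min(series),
    }


def dft_magnitudes(series, n_coeffs=3):
    """Magnitudes of the first `n_coeffs` DFT coefficients (naive O(n * n_coeffs))."""
    n = len(series)
    mags = []
    for k in range(n_coeffs):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        mags.append(abs(coeff))
    return mags


# Toy balance series, not real customer data.
balances = [100.0, 120.0, 90.0, 150.0]
features = summary_features(balances)
spectrum = dft_magnitudes(balances, n_coeffs=2)
```

The resulting fixed-length feature vector can be fed to any standard classifier, whereas the data-driven route lets the network learn its features from the raw series.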
17
Deep Learning
18
Neural Networks
➢ Differentiable transformations
○ E.g. f(x) = x * w
➢ Differentiable loss function
➢ Backpropagation (chain rule) -> LEGO
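A worked example of the chain rule for the transformation f(x) = x * w. The squared-error loss and the numbers used here are assumptions chosen for illustration; the slide does not name a specific loss.

```python
def forward(x, w):
    """The differentiable transformation from the slide: f(x) = x * w."""
    return x * w


def loss(pred, target):
    """Squared error, an assumed example of a differentiable loss."""
    return (pred - target) ** 2


def grad_w(x, w, target):
    """d loss / d w via the chain rule: (d loss / d pred) * (d pred / d w)."""
    pred = forward(x, w)
    dloss_dpred = 2.0 * (pred - target)
    dpred_dw = x
    return dloss_dpred * dpred_dw


def numeric_grad_w(x, w, target, eps=1e-6):
    """Central finite difference, used only to sanity-check grad_w."""
    up = loss(forward(x, w + eps), target)
    down = loss(forward(x, w - eps), target)
    return (up - down) / (2.0 * eps)


analytic = grad_w(3.0, 0.5, 2.0)
numeric = numeric_grad_w(3.0, 0.5, 2.0)
```

Agreement between the analytic and numeric values is a common way to check a hand-written gradient.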
19
Neural Networks
➢ Each block (transform) needs:
○ Forward pass (perform transformation)
○ Calculate gradient from BP-loss
○ Calculate BP-loss
➢ Extremely flexible
○ Classification / regression
○ Encoding
○ Generation (image, text, sound)
○ etc.
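The three requirements per block can be sketched as a small class. This sketch reads "BP-loss" as the gradient that is backpropagated from the following block, which is an interpretation rather than something the slides define; the class names `Scale` and `SquaredError` and the function `train_step` are hypothetical.

```python
class Scale:
    """Block computing y = x * w with a single learnable weight w."""

    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return x * self.w

    def backward(self, grad_out):
        # Gradient for this block's own parameter.
        self.grad_w = grad_out * self.x
        # Gradient handed back to the previous block.
        return grad_out * self.w


class SquaredError:
    """Differentiable loss at the end of the chain."""

    def forward(self, pred, target):
        self.diff = pred - target
        return self.diff ** 2

    def backward(self):
        return 2.0 * self.diff


def train_step(blocks, loss_fn, x, target, lr=0.01):
    """One forward pass, one backward pass, one gradient-descent update."""
    out = x
    for block in blocks:
        out = block.forward(out)
    value = loss_fn.forward(out, target)
    grad = loss_fn.backward()
    for block in reversed(blocks):
        grad = block.backward(grad)
    for block in blocks:
        block.w -= lr * block.grad_w
    return value


blocks = [Scale(0.5), Scale(2.0)]
losses = [train_step(blocks, SquaredError(), x=1.0, target=3.0) for _ in range(200)]
```

Because every block exposes the same interface, blocks can be stacked in any order, which is the "LEGO" property of backpropagation.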
20
Colorization
http://richzhang.github.io/colorization/
21
Translation
https://research.googleblog.com/2015/07/how-google-translate-squeezes-deep.html
22
Playing Games
https://deepmind.com/research/alphago/
23
MLP
24
MLP Convolutions
[Figure sequence (slides 25-37): an MLP panel, later joined by a Convolutions panel; weights w1, w2 applied to a series of length 365]
37
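The contrast between the two layer types can be sketched in a few lines. This is an illustration under assumptions: the series length 365 is the label shown on the slides, while the number of hidden units, the kernel width, and the toy series are made up for the example.

```python
def dense_layer(series, weights, biases):
    """Fully connected layer: each output unit has one weight per time step."""
    return [sum(w * x for w, x in zip(unit_w, series)) + b
            for unit_w, b in zip(weights, biases)]


def conv1d(series, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation form): one small kernel
    slides along the series, so the same weights are reused at every position."""
    k = len(kernel)
    return [sum(kernel[j] * series[i + j] for j in range(k)) + bias
            for i in range(len(series) - k + 1)]


n_steps = 365
series = [float(t % 7) for t in range(n_steps)]  # toy weekly pattern

n_hidden = 10                                   # assumed layer width
dense_params = n_hidden * n_steps + n_hidden    # weights plus biases

kernel = [0.25, 0.5, 0.25]                      # assumed kernel of width 3
conv_params = len(kernel) + 1                   # kernel weights plus one bias
conv_out = conv1d(series, kernel)
```

The dense layer needs a separate weight for every day and unit, while the convolution reuses a handful of weights across all positions and therefore detects the same local pattern wherever it occurs in the series.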
Our Architecture
MLP
Logistic regression
40
41
http://yosinski.com/deepvis
42
Data
46
Housing prices in Norway have generally increased steadily since 2003, and thus, the mortgage market has seen few defaults.
(Finanstilsynet, 2016)
47
Data
48
Results
49
➢ Increase the “low risk” group from 80% to 95%.
➢ 50% of defaults can be found in the 1% highest risk group.
➢ Not restricted to mortgages.
➢ Is only one part of the full mortgage risk model!
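A statement of the form "x% of defaults are found in the y% highest risk group" can be computed by ranking customers by predicted risk. The sketch below assumes scores where higher means riskier and labels where 1 means default; `capture_rate` is a hypothetical helper and the numbers are toy values, not the results reported above.

```python
import math


def capture_rate(scores, labels, top_fraction):
    """Share of all defaults (label 1) found among the `top_fraction`
    of customers with the highest predicted risk."""
    total_defaults = sum(labels)
    if total_defaults == 0:
        raise ValueError("labels contain no defaults")
    n_top = max(1, math.ceil(top_fraction * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = sum(label for _, label in ranked[:n_top])
    return found / total_defaults


# Toy data: 10 customers, 2 defaults.
scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.05, 0.4, 0.6, 0.15, 0.25]
labels = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
top_10_percent = capture_rate(scores, labels, 0.1)
top_50_percent = capture_rate(scores, labels, 0.5)
```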
50
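ROC Curve
➢ TPR: True Positive rate
○ TP / P
➢ FPR: False Positive rate
○ FP / N
➢ AUC: Area Under Curve
[Figure: ROC curve, TP rate (y-axis) against FP rate (x-axis)]
Example: 60% of defaults classified as default (TP rate 0.6), 20% of non-defaults classified as default (FP rate 0.2)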
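The ROC curve and its AUC can be computed directly from the two definitions TP / P and FP / N. A minimal sketch, assuming higher scores mean higher default risk and labels are 1 for default; the helper names `roc_points` and `auc` and the toy data are illustrative. Each point on the curve is one threshold, so the example of 60% of defaults and 20% of non-defaults flagged corresponds to the point (0.2, 0.6).

```python
def roc_points(scores, labels):
    """ROC points (FP rate, TP rate), lowering the threshold one distinct score at a time."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one default and one non-default")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process all customers sharing this score together (ties).
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data: 3 defaults, 3 non-defaults.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(roc_points(toy_scores, toy_labels))
```

An AUC of 1.0 means every default is ranked above every non-default, and 0.5 corresponds to random ranking.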
52
Architectures
DNB current risk model: 0.866
54
55
Size of Data
56
Random Forests
➢ Min
➢ Max
➢ Avg
➢ Std
➢ Missing
➢ Scaled versions
➢ Combinations
57
Questions?
60