Predicting churn in telco industry: machine learning approach - Marko Mitić

Dr. Marko Mitić, Business Data Analyst at Telenor Serbia

Upload: institute-of-contemporary-sciences

Post on 06-Jan-2017

TRANSCRIPT

Page 1: Predicting churn in telco industry: machine learning approach - Marko Mitić

Dr. Marko Mitić, Business Data Analyst at Telenor Serbia

Predicting churn in telco industry: machine learning approach

Page 2

[email protected]

Contents
• Introduction to machine learning
• Churn definition & telco data
• Algorithm description
• Data exploration
• Modelling in R language
• Conclusion

Page 3

Introduction to machine learning

Supervised learning
Each training example is a pair consisting of an input object and a desired output value.
• Regression (real values)
• Classification (discrete labels)

Unsupervised learning
Draws inferences from datasets without labeled responses.
• Clustering
• Dimensionality reduction

Reinforcement learning
An agent takes actions in an environment so as to maximize cumulative reward.

Page 4

Introduction to machine learning

[Figure panels: Regression, Classification, Clustering, Reinforcement Learning]

Page 5

Introduction to machine learning

[Diagram labels: Universal set (unobserved), Training set (observed), Testing set (unobserved), Data acquisition, Classification, Practical usage]
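The split between a training set and a held-out testing set can be sketched as follows. This is a minimal Python illustration, not the R code from the talk; the function name `train_test_split`, the 30% test fraction, and the stand-in customer ids are all assumptions made for the example.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of `rows` and split it into (training, testing) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                      # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)      # seeded, so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


customers = list(range(10))                    # stand-in customer ids (made up)
train, test = train_test_split(customers, test_fraction=0.3)
```

The model is fitted on `train` only; `test` plays the role of data the model has not observed, which is what makes the later evaluation metrics meaningful.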

Page 6

Churn definition
Churn rate (sometimes called attrition rate) is a measure of the number of individuals or items moving out of a collective group over a specific period of time.

= Customer leaving

• Pay TV
• E-mail/website subscribers
• Legal sector
• Recreation
• Newspaper subscribers
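As a worked example of the definition above, the churn rate for one period can be computed as customers lost divided by customers at the start of the period. Dividing by the start-of-period count is one common convention and is an assumption here, as are the function name and the numbers.

```python
def churn_rate(customers_at_start, customers_lost):
    """Fraction of the starting customer base that left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Made-up figures: 2000 subscribers at the start of the month, 50 of them left.
monthly_churn = churn_rate(2000, 50)
```

With these figures the monthly churn rate is 50 / 2000, that is 2.5%.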

Page 7

Telco data
Real telco data are available in the latest C50 library for the R language.

Feature engineering: 3/6 months average usage, average total charge, ...
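The 3- and 6-month average usage features mentioned above can be sketched as trailing-window means. This is a Python illustration under assumptions: the helper `trailing_mean`, the feature names, and the monthly minutes are hypothetical and do not come from the C50 data.

```python
def trailing_mean(values, window):
    """Mean of the last `window` values, or None if the history is too short."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return None
    return sum(values[-window:]) / window


# Made-up monthly usage in minutes for one customer, oldest to newest.
monthly_minutes = [310, 295, 280, 240, 200, 150]

features = {
    "avg_usage_3m": trailing_mean(monthly_minutes, 3),
    "avg_usage_6m": trailing_mean(monthly_minutes, 6),
}
```

A 3-month average well below the 6-month average, as in this made-up customer, is the kind of declining-usage signal such features are meant to expose to the model.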

Page 8

Algorithms
1. Logistic Regression
• In logistic regression the outcome variable is binary, and the purpose of the analysis is to assess the effects of multiple explanatory variables.

Odds of success: $\frac{P}{1-P} = e^{\alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p}$

The joint effect of all explanatory variables on the log-odds is
$\operatorname{logit}(P) = \ln\frac{P}{1-P} = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$
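The relationship between the linear predictor, the odds, and the probability can be checked numerically. The sketch below is in Python rather than R, and the coefficient values are invented for illustration; they are not fitted to any data.

```python
import math


def linear_predictor(alpha, betas, xs):
    """alpha + beta_1 * x_1 + ... + beta_p * x_p, i.e. the logit of P."""
    if len(betas) != len(xs):
        raise ValueError("betas and xs must have the same length")
    return alpha + sum(b * x for b, x in zip(betas, xs))


def odds(alpha, betas, xs):
    """Odds of success, P / (1 - P) = exp(linear predictor)."""
    return math.exp(linear_predictor(alpha, betas, xs))


def probability(alpha, betas, xs):
    """P recovered from the logit through the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(alpha, betas, xs)))


# Invented coefficients and inputs: logit = -2 + 0.5*2 + 1.0*1 = 0.
p_even = probability(-2.0, [0.5, 1.0], [2.0, 1.0])
```

A logit of 0 corresponds to odds of 1 and a probability of 0.5; each unit increase in X_j multiplies the odds by e^{β_j}.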

Page 9

Algorithms
2. Support Vector Machines
• SVMs maximize the margin around the separating hyperplane.
• The decision function is fully specified by a subset of training samples, the support vectors.

$w^T x_i + b \ge 1$, if $y_i = 1$
$w^T x_i + b \le -1$, if $y_i = -1$

• Margin: $\rho = \frac{2}{\lVert w \rVert}$
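The two constraints and the margin width 2/‖w‖ can be verified on a toy example. This Python sketch does not train an SVM; it only checks a given hyperplane, and the weight vector, bias, and points are made up.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def margin_width(w):
    """Distance between the hyperplanes w.x + b = 1 and w.x + b = -1."""
    return 2.0 / math.sqrt(dot(w, w))


def satisfies_constraints(w, b, points, labels, tol=1e-12):
    """True when y_i * (w.x_i + b) >= 1 for every sample (labels are +1 / -1)."""
    return all(y * (dot(w, x) + b) >= 1 - tol for x, y in zip(points, labels))


# Made-up separable data: the first two points lie exactly on the margin.
w, b = [1.0, 0.0], 0.0
points = [(1.0, 0.0), (-1.0, 0.0), (2.0, 5.0)]
labels = [1, -1, 1]
feasible = satisfies_constraints(w, b, points, labels)
width = margin_width(w)
```

The points that meet the constraints with equality, here the first two, are the support vectors; shrinking ‖w‖ while keeping every constraint satisfied is what widens the margin.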

Page 10

Algorithms
3. Neural Network
• A neural network (NN) is a computational model based on the structure and functions of biological neural networks.
• A neural network usually involves a large number of processing units, with the aim of successfully mapping the input space to the output space through an iterative process.
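To make the iterative mapping from inputs to outputs concrete, here is a deliberately small Python sketch with a single sigmoid processing unit trained by batch gradient descent. A real network stacks many such units; the AND-style toy data, the learning rate, and the epoch count are assumptions chosen only so the example is easy to follow.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(weights, bias, x):
    """Output of one sigmoid unit for input vector x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_neuron(samples, targets, epochs=5000, learning_rate=0.5):
    """Fit one sigmoid unit by batch gradient descent on cross-entropy loss."""
    n_inputs = len(samples[0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_inputs, 0.0
        for x, t in zip(samples, targets):
            err = neuron_output(weights, bias, x) - t   # dLoss/d(pre-activation)
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Step against the mean gradient over the batch.
        weights = [w - learning_rate * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(samples)
    return weights, bias


# Toy data: the target is 1 only when both inputs are 1.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
weights, bias = train_neuron(samples, targets)
predictions = [1 if neuron_output(weights, bias, x) > 0.5 else 0 for x in samples]
```

Each pass over the data nudges the weights a little, which is the iterative process the slide refers to; back-propagation extends the same gradient step to networks with hidden layers.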

Page 11

Algorithms
4. Boosting
• AdaBoost
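AdaBoost builds a weighted vote of weak learners, re-weighting the training samples after each round so that misclassified ones count more. The Python sketch below uses one-dimensional decision stumps as the weak learner; the stump search, the function names, and the ten-point toy data set are assumptions made for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """Predict `polarity` when x >= threshold, otherwise the opposite label."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Train AdaBoost on 1-D inputs with labels in {+1, -1}."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)           # guard the logarithm
        alpha = 0.5 * math.log((1 - err) / err)         # vote of this stump
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified samples, lower the rest, renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy labels that no single threshold can separate (+1 could mean "churn").
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
```

On this toy set the best single stump misclassifies three of the ten points, while the three-stump ensemble classifies all of them correctly.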

Page 12

Evaluation metrics
Confusion matrix
• Accuracy, Precision, Recall
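The three metrics follow directly from the four cells of the confusion matrix. In the Python sketch below the labels are made up, 1 stands for "churned", and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(actual, predicted, positive=1):
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,   # flagged who really churned
        "recall": tp / (tp + fn) if tp + fn else 0.0,      # churners that were caught
    }


# Made-up labels: 3 churners among 10 customers.
actual    = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(actual, predicted)
```

Here accuracy is 0.8 while precision and recall are both 2/3. Because churners are usually a minority, accuracy alone can look good even when many churners are missed.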

Page 13

Evaluation metrics
ROC curve and AUC
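The ROC curve plots the true positive rate against the false positive rate as the decision threshold is lowered, and the AUC is the area under that curve. A minimal Python sketch, assuming made-up scores and labels where 1 marks a churner:

```python
def roc_points(actual, scores, positive=1):
    """(false positive rate, true positive rate) pairs, one per distinct threshold."""
    n_pos = sum(1 for a in actual if a == positive)
    n_neg = len(actual) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == positive)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a != positive)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up predicted churn probabilities and true outcomes.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(actual, scores)
area = auc(curve)
```

For these scores the area is 8/9, which matches the fraction of churner and non-churner pairs in which the churner gets the higher score. An AUC of 0.5 corresponds to random ranking, 1.0 to perfect ranking.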

Page 14

Data exploration

Page 15

Modelling in R (1)
Logistic Regression

Page 16

Modelling in R (1.1)
ROC and AUC

Page 17

Modelling in R (2)
Support Vector Machines

Page 18

Modelling in R (4)
BP Neural Networks

Page 19

Conclusions
• 3 machine learning algorithms for churn prediction are presented
• Logistic Regression and BP Neural Net with boosting gave the best results
• A good basis for a successful broadcast campaign towards potential churners

Could work even better
• Implementation of more complex ML algorithms (Random Forest, Gradient Boosting Machines, Deep NNs)
• Generate hybrid ensemble models

• Generate hybrid ensemble models

Page 20

Thank you!