![Page 1: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/1.jpg)
Business Data Analytics
Lecture 5: Customer Lifecycle Management – Classification Problems
MTAT.03.319
The slides are available under a Creative Commons license. The original owner of these slides is the University of Tartu.
1
![Page 2: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/2.jpg)
Outline
• Classification and its applications to CLM
• Classification with logistic regression
• Classification with decision trees
• Black-box classification models
• Measuring the quality of classification models
• Pitfalls: class imbalance & overfitting
2
![Page 3: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/3.jpg)
Unsupervised learning vs. supervised learning
[Figure: two scatter plots, one per learning type, with axes Feature 1 and Feature 2]
3
![Page 4: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/4.jpg)
Supervised Learning: Regression vs. classification
[Figure: two pipelines, each running Raw Dataset (samples × features) → Preprocessing → Training → Prediction. The regressor outputs a quantity for each sample; the classifier outputs a class together with a class probability (confidence), e.g. 0.78.]
Note: A classifier can be trained to classify each sample into more than two classes.
4
![Page 5: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/5.jpg)
Classification in Customer Lifecycle Management
• Lead propensity
• Cross-sell/up-sell propensity
• Response modeling / buying propensity
5
![Page 6: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/6.jpg)
Applying classification
• We are a charity. We have a database containing 100K donors who have not donated in the past 12 months. We know their basic demographics, their address, and how much they have donated in the past (and when).
• Sending a Christmas postcard asking for a donation costs 60 cents per piece. When we mail out, the average donation (when there is one) is about 2 euros.
• Should we send a (postal) mail to all 100K donors?
• How can a classifier help us answer this question?
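One way to frame the question is as an expected-value calculation: mailing a donor pays off only if the probability of a donation times the average donation exceeds the mailing cost. The sketch below works this out with the two figures from the slide. The predicted probabilities in `p_donate` are made-up values standing in for the output of a classifier, not data from the lecture.

```r
cost_per_mail <- 0.60   # euros, from the slide
avg_donation  <- 2.00   # euros, from the slide (average when a donation is made)

# Mailing is profitable when p * avg_donation > cost_per_mail
break_even_p <- cost_per_mail / avg_donation

# Hypothetical classifier output: predicted donation probability per donor
p_donate <- c(0.05, 0.10, 0.28, 0.35, 0.60)

send_mail       <- p_donate > break_even_p
expected_profit <- p_donate * avg_donation - cost_per_mail

data.frame(p_donate, send_mail, expected_profit)
```

Under these assumptions the break-even probability is 0.60 / 2.00 = 0.30, so mailing everyone only pays off if roughly 30% of the donors respond. A classifier lets us mail only the donors whose predicted probability is above that level.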
6
![Page 7: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/7.jpg)
Classification in Customer Lifecycle Management
• Lead propensity
• Fault/Complaint prediction
• Churn propensity
• Cross-sell/up-sell propensity
• Response modeling / buying propensity
• Win-back propensity
7
![Page 8: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/8.jpg)
What is churn?
Subscription-based vs. event-based
8
![Page 9: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/9.jpg)
Why care?
• For most companies, the customer acquisition cost (the cost of acquiring a new customer) is several times higher than the cost of retaining an existing customer
• Bringing a new customer to the same level of profitability as an existing customer takes time
• Churn and switching can often be avoided with proactive measures
9
![Page 10: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/10.jpg)
How to decrease churn?
Long-term: tactical intervention
Short-term: operational intervention
10
![Page 11: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/11.jpg)
How to decrease short-term churn?
• Recurrent payments (automatic subscription renewal)
• Reminder emails (e.g. account expiration)
• Providing extra services and knowledge (e.g. figure out where clients have trouble and provide help with that)
• Use a scoring model to determine the intervention type and moment
11
![Page 12: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/12.jpg)
Logistic regression is not regression!* It is classification.
* It is called so due to its similarity to linear regression.
12
![Page 13: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/13.jpg)
Binary logistic regression
“Binary” means that y is binary: y ∈ {0, 1}
13
![Page 14: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/14.jpg)
Binary logistic regression
• sigmoid means S-shaped
• also known as a squashing function, since it maps the real line to the interval (0, 1),
• which is necessary if the output needs to be interpreted as probability
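As a small illustration of the squashing behaviour, the sketch below defines the sigmoid in R and evaluates it at a few points. The name `sigmoid` and the test values in `z` are chosen here for illustration; `plogis()` is the equivalent function that ships with base R.

```r
# Sigmoid (logistic) function: maps any real number into (0, 1)
sigmoid <- function(z) 1 / (1 + exp(-z))

z <- c(-10, -1, 0, 1, 10)
sigmoid(z)

# Base R provides the same function as plogis()
all.equal(sigmoid(z), plogis(z))
```

Large negative inputs end up close to 0, large positive inputs close to 1, and an input of 0 maps to exactly 0.5.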
14
![Page 15: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/15.jpg)
Binary logistic regression
“0”
“1”
If we threshold the output at 0.5, we create
a decision rule of the form
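One way to write such a rule, assuming a single feature $x$, a model output $\sigma(w_0 + w_1 x)$ with $\sigma$ the sigmoid, and coefficients $w_0, w_1$ (these symbols are introduced here for illustration and are not taken from the slides):

```latex
\hat{y} =
\begin{cases}
1 & \text{if } \sigma(w_0 + w_1 x) \ge 0.5 \iff w_0 + w_1 x \ge 0 \\
0 & \text{otherwise}
\end{cases}
```

Because $\sigma(0) = 0.5$, thresholding the probability at 0.5 is the same as checking the sign of the linear part. Whether a tie at exactly 0.5 goes to class 1 or class 0 is a convention.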
15
![Page 16: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/16.jpg)
Binary logistic regression: example in R
logit <- glm(data=train, as.factor(danger)~waiting, family='binomial')
test$predictions_probability <- predict(newdata=test, logit, type = 'response')
test$predictions_binary <- ifelse(test$predictions_probability<=0.5, 0, 1)
table(real=test$danger, predictions=test$predictions_binary)

| real \ predicted value | 0 | 1 |
|---|---|---|
| 0 | 34 | 0 |
| 1 | 2 | 64 |
16
![Page 17: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/17.jpg)
Binary logistic regression: example in R
logit <- glm(data=train, as.factor(danger)~waiting, family='binomial')
summary(logit)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -42.5845    11.5448  -3.689 0.000225 ***
waiting       0.6380     0.1728   3.692 0.000223 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Coefficients are difficult to interpret directly: they are expressed on the log-odds scale.
We can exponentiate the coefficients to obtain odds ratios: exp(coef(logit))

(Intercept)     waiting
    "0.000"     "1.893"

An increase of waiting by one unit multiplies the odds of “danger” by 1.893 (an increase of about 89%).
17
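To make the interpretation concrete, the sketch below plugs the two estimates from the summary output into the logistic formula. The helper `p_danger` and the example waiting values are illustrative choices made here, not part of the lecture code.

```r
# Coefficients copied from the summary output
b0 <- -42.5845
b1 <- 0.6380

# Odds ratio for a one-unit increase in waiting
odds_ratio <- exp(b1)

# Predicted probability of "danger" for a given waiting value
p_danger <- function(waiting) plogis(b0 + b1 * waiting)

# The probability crosses 0.5 where b0 + b1 * waiting = 0
boundary <- -b0 / b1

p_danger(c(60, boundary, 75))
```

The value of `boundary` is the waiting value at which a 0.5 threshold switches the predicted class: below it the model predicts “0”, above it “1”.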
![Page 18: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/18.jpg)
Outline
• Classification and applications to CLM
• Classification with logistic regression
• Classification with decision trees
• Black-box classification models
• Measuring the quality of classification models
• Pitfalls: class imbalance & overfitting
18
![Page 19: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/19.jpg)
Decision Trees
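[Figure: example decision tree. The root splits on Age (<30 vs. >=30); the <30 branch is a YES leaf; the >=30 branch splits on Car Type (Minivan: YES; Sports, Truck: NO). A second panel shows the same YES/NO regions over an Age axis (0, 30, 60) for Minivan and Sports/Truck.]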
![Page 20: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/20.jpg)
Decision Tree: Split Selection
• Numerical or ordered attributes: Find a split point that separates the (two) classes
[Figure: samples marked Yes or No along the Age axis, with candidate split points at 30 and 35]
![Page 21: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/21.jpg)
Decision Tree: Split Selection
Categorical attributes: How to group?
Sport: Truck: Minivan:
(Sport, Truck): (Minivan):
(Sport): (Truck, Minivan):
(Sport, Minivan): (Truck)
Which split should we pick?
Hint: the one with the higher “purity”, i.e. the lower value of the impurity measure
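A minimal sketch of how the three candidate groupings could be compared, assuming Gini impurity as the impurity measure (the slides do not name one) and made-up Yes/No counts per car type, since the counts from the slide are not available here:

```r
# Hypothetical class counts per car type (not from the slides)
counts <- rbind(
  Sport   = c(yes = 0, no = 4),
  Truck   = c(yes = 1, no = 3),
  Minivan = c(yes = 5, no = 1)
)

# Gini impurity of one group: 0 means the group is perfectly pure
gini <- function(class_counts) {
  p <- class_counts / sum(class_counts)
  1 - sum(p^2)
}

# Size-weighted impurity of a binary split: `left` vs. all other car types
split_impurity <- function(left) {
  right  <- setdiff(rownames(counts), left)
  groups <- list(colSums(counts[left, , drop = FALSE]),
                 colSums(counts[right, , drop = FALSE]))
  sum(sapply(groups, function(g) sum(g) / sum(counts) * gini(g)))
}

candidates <- list(
  "Sport,Truck | Minivan" = c("Sport", "Truck"),
  "Sport | Truck,Minivan" = "Sport",
  "Sport,Minivan | Truck" = c("Sport", "Minivan")
)

impurity <- sapply(candidates, split_impurity)
best     <- names(which.min(impurity))
```

With these particular counts the split with the lowest weighted impurity is (Sport, Truck) vs. (Minivan). Other counts, or another impurity measure such as entropy, may rank the splits differently.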
![Page 22: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/22.jpg)
Decision Tree
Advantages
1. Easy to understand
2. Useful in data exploration
3. Very fast to calculate (even for large datasets)
Disadvantages
1. Often lower accuracy than ensemble methods (like logistic regression)
2. Overfitting (more on this later)
22
![Page 23: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/23.jpg)
White-Box vs. Black-Box Models
• White-box (Input → interpretable model → Output):
  • Logistic regression
  • Decision tree
• Black-box (Input → opaque model → Output):
  • Ensemble methods: Random Forests, Gradient boosting (e.g. XGBoost)
23
![Page 24: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/24.jpg)
Ensemble methods
• Idea: Train multiple classifiers on subsets of the data or subsets of the features and combine their predictions (e.g. by voting)
• Examples:
  • Random forests
24
![Page 25: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/25.jpg)
Random forest
3 key ideas:
1. Tree bagging: each tree is built with a random subset of the data
2. Each tree is built with a random subset of the features
3. Voting
25
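The three ideas can be sketched in base R with a deliberately simplified forest. The assumptions here are that each "tree" is a single split (a stump), that the toy data are synthetic, and that the feature subset is drawn once per tree; real random forest implementations grow deeper trees and usually resample the features at every split.

```r
set.seed(1)

# Hypothetical toy data: three numeric features, binary class
n <- 200
X <- data.frame(f1 = rnorm(n), f2 = rnorm(n), f3 = rnorm(n))
y <- as.integer(X$f1 + X$f2 > 0)

# Fit a one-split tree using only the given features: pick the feature and
# threshold with the fewest training errors, predicting the majority class
# on each side of the split.
fit_tree <- function(X, y, features) {
  best <- list(err = Inf)
  for (f in features) {
    for (t in unique(X[[f]])) {
      go_left     <- X[[f]] <= t
      left_class  <- as.integer(mean(y[go_left]) >= 0.5)
      right_class <- if (any(!go_left)) as.integer(mean(y[!go_left]) >= 0.5) else left_class
      err <- sum(y[go_left] != left_class) + sum(y[!go_left] != right_class)
      if (err < best$err) {
        best <- list(err = err, feature = f, threshold = t,
                     left = left_class, right = right_class)
      }
    }
  }
  best
}

predict_tree <- function(tree, X) {
  ifelse(X[[tree$feature]] <= tree$threshold, tree$left, tree$right)
}

n_trees <- 25
forest <- lapply(seq_len(n_trees), function(b) {
  rows     <- sample(nrow(X), replace = TRUE)   # idea 1: bagging (bootstrap sample)
  features <- sample(names(X), 2)               # idea 2: random subset of features
  fit_tree(X[rows, , drop = FALSE], y[rows], features)
})

votes      <- sapply(forest, predict_tree, X = X)     # one column of votes per tree
prediction <- as.integer(rowMeans(votes) >= 0.5)      # idea 3: majority vote
accuracy   <- mean(prediction == y)
```

The accuracy here is measured on the training data, so it only shows that the voting mechanism works; it says nothing about performance on unseen data.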
![Page 26: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/26.jpg)
Ensemble methods
• Idea: Train multiple classifiers on subsets of the data or subsets of the features and combine their predictions (e.g. by voting)
• Examples:
  • Random forests
  • XGBoost, LightGBM (state of the art)
• Why is it powerful?
  • Because we can tune the parameters, e.g. number of trees, depth, etc. This is called “hyperparameter optimization”
26
![Page 27: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/27.jpg)
Outline
• Classification and applications to CLM
• Classification with logistic regression
• Classification with decision trees
• Black-box classification models
• Measuring the quality of classification models
• Pitfalls: class imbalance & overfitting
27
![Page 28: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/28.jpg)
Quality of Classifiers
28
![Page 29: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/29.jpg)
Quality of classifiers
29
![Page 30: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/30.jpg)
Quality of Classifiers
Calculate a confusion matrix
It depends on the class probability threshold!
[Data table with columns: Features, Probability of positive class, Actual class]
30
![Page 31: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/31.jpg)
Quality of Classifiers
Threshold = 0.5, i.e., if prediction >= 0.5 then churn (1), else no churn (0)

| Predicted (rows) \ Actual (columns) | 1 (pos) | 0 (neg) |
|---|---|---|
| 1 (pos) | 2 | 1 |
| 0 (neg) | 0 | 2 |

[Data table with columns: Features, Probability of positive class, Actual class]
31
![Page 32: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/32.jpg)
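Quality of Classifiers
Confusion matrix

| Predicted (rows) \ Actual (columns) | 1 (P) | 0 (N) |
|---|---|---|
| 1 (P) | 2 | 1 |
| 0 (N) | 0 | 2 |

Precision = 2/3 (2 of the 3 predicted positives are actual positives)
Recall = 2/2 (both actual positives are predicted as positive)
[Data table with columns: Features, Probability of positive class, Actual class]
32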
![Page 33: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/33.jpg)
Quality of Classifiers
Threshold = 0.8, i.e., if prediction >= 0.8 then churn (1), else no churn (0)

| Predicted (rows) \ Actual (columns) | 1 (pos) | 0 (neg) |
|---|---|---|
| 1 (pos) | 2 | 0 |
| 0 (neg) | 0 | 3 |

[Data table with columns: Features, Probability of positive class, Actual class]
Now: what is the precision and the recall?
33
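The effect of the threshold can be checked with a few lines of R. The scores in `prob` are hypothetical values chosen so that the counts match the two confusion matrices, reading their rows as predicted classes and their columns as actual classes; the data table from the slides is not available here.

```r
# Hypothetical data: 2 actual churners, 3 actual non-churners
actual <- c(1, 1, 0, 0, 0)
prob   <- c(0.90, 0.85, 0.60, 0.30, 0.10)

evaluate <- function(threshold) {
  predicted <- ifelse(prob >= threshold, 1, 0)
  tp <- sum(predicted == 1 & actual == 1)
  fp <- sum(predicted == 1 & actual == 0)
  fn <- sum(predicted == 0 & actual == 1)
  tn <- sum(predicted == 0 & actual == 0)
  c(tp = tp, fp = fp, fn = fn, tn = tn,
    precision = tp / (tp + fp),
    recall    = tp / (tp + fn))
}

at_05 <- evaluate(0.5)
at_08 <- evaluate(0.8)
rbind(at_05, at_08)
```

With these scores, raising the threshold from 0.5 to 0.8 turns the one false positive into a true negative, so precision rises from 2/3 to 1 while recall stays at 1.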
![Page 34: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/34.jpg)
Quality of Classifiers
• Accuracy: Ratio of correctly predicted observations to the total number of observations
• Accuracy = (TP+TN)/(TP+FP+FN+TN)
• Intuitive, but misleading in case of class imbalance
• F-Score: Harmonic mean of Precision and Recall
• F-Score = 2*(Recall * Precision) / (Recall + Precision)
• More resilient to class imbalance
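As a worked example of the two formulas, the sketch below uses the counts TP = 2, FP = 1, FN = 0, TN = 2, which correspond to the threshold = 0.5 example when precision is 2/3 and recall is 2/2.

```r
tp <- 2; fp <- 1; fn <- 0; tn <- 2

accuracy  <- (tp + tn) / (tp + fp + fn + tn)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f_score   <- 2 * (recall * precision) / (recall + precision)

c(accuracy = accuracy, precision = precision, recall = recall, f_score = f_score)
```

Here accuracy and F-score both come out at 0.8, which is a coincidence of this small example rather than a general property.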
34
![Page 35: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/35.jpg)
Outline
• Classification and applications to CLM
• Classification with logistic regression
• Classification with decision trees
• Black-box classification models
• Measuring the quality of classification models
• Pitfalls: class imbalance & overfitting
35
![Page 36: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/36.jpg)
Class imbalance
A credit card company collects data about OK vs fraudulent transactions
OK transactions make up 99.999% of transactions
We train a classifier to classify transactions into OK and fraudulent
We get an accuracy of 99.999%
Why?
![Page 37: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/37.jpg)
Class imbalance
A classifier that always predicts not-fraud:
• 99.997% accuracy
• 99.997% precision (for the not-fraud class)
• 0 recall (for the fraud class)!
Why does the classifier learn this?
Because twisting the classifier to catch the bad guys makes us also “catch” some good guys… so no visible improvement
Slide by: David Kauchak
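The numbers are easy to reproduce. The sketch below assumes 100,000 transactions of which 3 are fraudulent (a made-up split that gives 99.997% OK transactions) and a "classifier" that always predicts not-fraud.

```r
n_total <- 100000
n_fraud <- 3    # hypothetical: 0.003% of all transactions

actual    <- c(rep("fraud", n_fraud), rep("ok", n_total - n_fraud))
predicted <- rep("ok", n_total)   # always predict not-fraud

accuracy     <- mean(predicted == actual)
recall_fraud <- sum(predicted == "fraud" & actual == "fraud") / sum(actual == "fraud")

c(accuracy = accuracy, recall_fraud = recall_fraud)
```

Accuracy looks excellent even though not a single fraudulent transaction is caught, which is why accuracy alone is a poor guide on imbalanced data.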
![Page 38: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/38.jpg)
Dealing with class imbalance
38
• Use a quality measure that is as robust as possible
• F-Score is OK in extreme cases (like the previous one)
• Less robust in more subtle cases, e.g. 90-10 class imbalance
• ROC and AUC (next) are better…
• Area Under the Precision Recall Curve even better (AUPRC)
![Page 39: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/39.jpg)
ROC Curve and AUC: another metric robust against imbalanced datasets
ROC – Robust Measurement of Classifier Quality
True Positive Rate (TPR) = Sensitivity = tp/(tp + fn)
False Positive Rate (FPR) = fp/(fp + tn)
Let’s plot the ROC curve for the following example
[Figure: empty ROC plot; x-axis FPR and y-axis TPR, both from 0 to 1 with ticks at 0.25, 0.5 and 0.75]
39
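A minimal base-R sketch of how the points of a ROC curve and the AUC can be computed by sweeping the threshold over the predicted scores. The labels and scores are made up for illustration, and the helper names `roc_points` and `auc_trapezoid` are introduced here; dedicated packages exist for this, but none is assumed.

```r
# Hypothetical labels (1 = positive) and predicted scores
labels <- c(1, 1, 0, 1, 0, 0)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.3, 0.1)

# One (FPR, TPR) point per threshold, from strictest to most lenient
roc_points <- function(labels, scores) {
  thresholds <- c(Inf, sort(unique(scores), decreasing = TRUE))
  tpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 1) / sum(labels == 1))
  fpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 0) / sum(labels == 0))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

# Area under the curve using the trapezoid rule
auc_trapezoid <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

roc <- roc_points(labels, scores)
auc <- auc_trapezoid(roc)

plot(roc$fpr, roc$tpr, type = "b", xlab = "FPR", ylab = "TPR")
abline(0, 1, lty = 2)   # diagonal = random guess
```

For this toy data the AUC is 8/9: of the 9 possible (positive, negative) pairs, 8 are ranked correctly by the scores.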
![Page 40: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/40.jpg)
Intuition behind ROC
[Figure: ROC curves of two models, with the diagonal marked as random guess]
model1: Area under the curve: 0.9991
model2: Area under the curve: 0.8365
Both our model and the second one predict correctly when they are confident in their scores; model2 makes mistakes earlier.
40
![Page 41: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/41.jpg)
Intuition behind ROC
AUC = area under the ROC curve
41
![Page 42: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/42.jpg)
Dealing with class imbalance
• Use a quality measure that is as robust as possible
  • F-Score is OK in extreme cases (like the previous one)
  • Less robust in more subtle cases, e.g. 90-10 class imbalance
  • ROC and AUC are better…
  • Area Under the Precision-Recall Curve (AUPRC) is even better
• Undersampling
  • Deliberately throw away training data
  • E.g. we have 6000 positive samples and 500 negative samples for training
  • We randomly select 500 of the 6000 positive samples
• Oversampling
  • Synthetic generation of minority class data, e.g. SMOTE
42
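Random undersampling with the class sizes from the example (6000 positive, 500 negative) can be sketched in base R as follows. The data frame `train` and its columns are hypothetical placeholders for a real training set.

```r
set.seed(123)

# Hypothetical training set with the class sizes from the slide
train <- data.frame(
  id    = 1:6500,
  label = c(rep("positive", 6000), rep("negative", 500))
)

pos_rows <- which(train$label == "positive")
neg_rows <- which(train$label == "negative")

# Keep every minority-class row and an equally large random sample
# (without replacement) of the majority class
keep     <- c(sample(pos_rows, length(neg_rows)), neg_rows)
balanced <- train[keep, ]

table(balanced$label)
```

Only the training data should be rebalanced this way; evaluating on a test set with the original class proportions keeps the quality estimate realistic.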
![Page 43: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/43.jpg)
Under- and overfitting
Source: http://www.inf.ed.ac.uk/teaching/courses/iaml/slides/eval-2x2.pdf
43
![Page 44: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/44.jpg)
How to detect overfitting
• Compute AUC (or F-score) on the training set and on the testing set
• The larger the difference, the more likely it is that overfitting has taken place
• E.g. AUC on training = 0.97, AUC on testing = 0.85
![Page 45: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/45.jpg)
Detecting overfitting (more rigorously)
K-fold cross-validation
Step 1: We split our data into K folds.
Step 2: We use (K-1) folds for training and the remaining fold for testing.
Step 3: We repeat step 2 K times, each time holding out a different fold for testing.
Step 4: Finally, we average the model quality across all K models.
The measure you get is more reflective of the likely performance in practice…
E = Output (Error or Accuracy) of your model
Source: http://karlrosaen.com/ml/learning-log/2016-06-20/
45
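The four steps can be sketched in base R as follows, assuming a synthetic data set, K = 5, logistic regression as the model and accuracy as the quality measure E. All names and values are illustrative choices, not part of the lecture material.

```r
set.seed(42)

# Hypothetical data: one feature x, binary outcome y
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(1.5 * d$x))

# Step 1: assign every row to one of K folds at random
k    <- 5
fold <- sample(rep(seq_len(k), length.out = n))

# Steps 2 and 3: train on K-1 folds, test on the held-out fold, for each fold
accuracy_per_fold <- sapply(seq_len(k), function(i) {
  train <- d[fold != i, ]
  test  <- d[fold == i, ]
  model <- glm(y ~ x, data = train, family = "binomial")
  p     <- predict(model, newdata = test, type = "response")
  mean(ifelse(p >= 0.5, 1, 0) == test$y)
})

# Step 4: average the quality measure across the K models
cv_accuracy <- mean(accuracy_per_fold)
```

Any other quality measure, such as AUC or F-score, can be substituted for accuracy inside the loop.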
![Page 46: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/46.jpg)
Practice time!
https://courses.cs.ut.ee/2018/bda/fall/Main/Practice
46
![Page 47: Lecture 5: Customer Lifecycle Management – Classification ... · Classification Problems MTAT.03.319 The slides are available under creative common license. The original owner of](https://reader035.vdocuments.mx/reader035/viewer/2022081402/6053c97c1ce76e394b03e9c4/html5/thumbnails/47.jpg)
In case you need more…
Theory
• Logistic regression: https://www.youtube.com/watch?v=Z5WKQr4H4Xk
• Decision Trees: https://www.youtube.com/watch?v=opQ49Xr748k
• Random Forest: https://www.youtube.com/watch?v=gmmV4drPTS4
Practical
• Logistic regression: https://www.youtube.com/watch?v=nubin7hq4-s
• Decision Tree: https://www.youtube.com/watch?v=JFJIQ0_2ijg&t=1026s
• Random Forest: https://www.youtube.com/watch?v=nVMw7fTlj4o
47