
Page 1: Decision tree and random forest

Decision Tree & Random Forest Algorithm

Page 2: Decision tree and random forest

Outline

Introduction

Example of Decision Tree

Principles of Decision Tree
– Entropy
– Information Gain

Random Forest


Page 3: Decision tree and random forest

The problem

Given a set of training cases/objects and their attribute values, try to determine the target attribute value of new examples.

– Classification
– Prediction

[Diagram: the Training Set feeds a learning algorithm to learn a model (induction); the model is then applied to the Test Set (deduction).]

Training Set

Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

Test Set

Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?

Page 4: Decision tree and random forest

Key Requirements

Attribute-value description: each object or case must be expressible in terms of a fixed collection of properties or attributes (e.g., hot, mild, cold).

Predefined classes (target values): the target function has discrete output values (Boolean or multiclass).

Sufficient data: enough training cases should be provided to learn the model.


Page 5: Decision tree and random forest

A simple example
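The example slide itself is an image and is not in the transcript, but the later slides (Outlook, Temp., Humidity, Windy, Play Golf) point to the classic Play Golf data. Below is a sketch of that dataset as Python records, assuming the common variant whose value names match the Rainy/Mild/High/False example on a later slide; the names play_golf, "Temp." and "Play Golf" are choices made here, not taken from the slides. It is reused in the code sketches that follow.

play_golf = [
    {"Outlook": "Rainy",    "Temp.": "Hot",  "Humidity": "High",   "Windy": False, "Play Golf": "No"},
    {"Outlook": "Rainy",    "Temp.": "Hot",  "Humidity": "High",   "Windy": True,  "Play Golf": "No"},
    {"Outlook": "Overcast", "Temp.": "Hot",  "Humidity": "High",   "Windy": False, "Play Golf": "Yes"},
    {"Outlook": "Sunny",    "Temp.": "Mild", "Humidity": "High",   "Windy": False, "Play Golf": "Yes"},
    {"Outlook": "Sunny",    "Temp.": "Cool", "Humidity": "Normal", "Windy": False, "Play Golf": "Yes"},
    {"Outlook": "Sunny",    "Temp.": "Cool", "Humidity": "Normal", "Windy": True,  "Play Golf": "No"},
    {"Outlook": "Overcast", "Temp.": "Cool", "Humidity": "Normal", "Windy": True,  "Play Golf": "Yes"},
    {"Outlook": "Rainy",    "Temp.": "Mild", "Humidity": "High",   "Windy": False, "Play Golf": "No"},
    {"Outlook": "Rainy",    "Temp.": "Cool", "Humidity": "Normal", "Windy": False, "Play Golf": "Yes"},
    {"Outlook": "Sunny",    "Temp.": "Mild", "Humidity": "Normal", "Windy": False, "Play Golf": "Yes"},
    {"Outlook": "Rainy",    "Temp.": "Mild", "Humidity": "Normal", "Windy": True,  "Play Golf": "Yes"},
    {"Outlook": "Overcast", "Temp.": "Mild", "Humidity": "High",   "Windy": True,  "Play Golf": "Yes"},
    {"Outlook": "Overcast", "Temp.": "Hot",  "Humidity": "Normal", "Windy": False, "Play Golf": "Yes"},
    {"Outlook": "Sunny",    "Temp.": "Mild", "Humidity": "High",   "Windy": True,  "Play Golf": "No"},
]
# 14 cases: 9 "Yes", 5 "No"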


Page 6: Decision tree and random forest

Principled Criterion

Choosing the most useful attribute for classifying examples.

Entropy
- A measure of the homogeneity of the set of examples
- If the sample is completely homogeneous the entropy is zero; if the sample is equally divided it has an entropy of one

Information Gain
- Measures how well a given attribute separates the training examples according to their target classification
- This measure is used to select among the candidate attributes at each step while growing the tree (both measures are sketched in code below)
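A compact sketch of both measures over the play_golf records above; the helper names entropy and information_gain are mine, not the slides'.

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy in bits: 0 for a pure sample, 1 for a 50/50 split
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attribute, target="Play Golf"):
    # Entropy of the target minus the weighted entropy after splitting on attribute
    before = entropy([r[target] for r in rows])
    after = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after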


Page 7: Decision tree and random forest

Information Gain

Step 1: Calculate the entropy of the target.
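A worked instance, assuming the 9 "Yes" / 5 "No" target counts of the dataset above (the slide's own figures are not in the transcript):

entropy([r["Play Golf"] for r in play_golf])
# = -(9/14)*log2(9/14) - (5/14)*log2(5/14) ≈ 0.940 bits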


Page 8: Decision tree and random forest

Information Gain (Cont’d)

Step 2: Calculate the information gain for each attribute.
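Continuing the sketch, one gain per candidate attribute; the rounded figures in the comment are what the classic data yields, not values taken from the slide:

for attr in ("Outlook", "Temp.", "Humidity", "Windy"):
    print(attr, round(information_gain(play_golf, attr), 3))
# ≈ Outlook 0.246, Temp. 0.029, Humidity 0.152, Windy 0.048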


Page 9: Decision tree and random forest

Information Gain (Cont’d)

Step 3: Choose the attribute with the largest information gain as the decision node.
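In the sketch this is a one-liner; on the classic data Outlook wins:

root = max(("Outlook", "Temp.", "Humidity", "Windy"),
           key=lambda a: information_gain(play_golf, a))
# root == "Outlook"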


Page 10: Decision tree and random forest

Information Gain (Cont’d)

Step 4a: A branch with entropy of 0 is a leaf node.


Page 11: Decision tree and random forest

Information Gain (Cont’d)

Step 4b: A branch with entropy greater than 0 needs further splitting.


Page 12: Decision tree and random forest

Information Gain (Cont’d)

Step 5: The algorithm is run recursively on the non-leaf branches, until all data is classified.
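Steps 4a through 5 together are the whole recursion. A minimal ID3-style sketch under the same assumptions as above; build_tree is my name, and the majority-vote fallback for exhausted attributes is a detail added here:

def build_tree(rows, attributes, target="Play Golf"):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                  # Step 4a: entropy 0, make a leaf
        return labels[0]
    if not attributes:                         # no attributes left: majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    branches = {}
    for value in {r[best] for r in rows}:      # Step 4b: split each impure branch
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attributes if a != best]
        branches[value] = build_tree(subset, remaining, target)   # Step 5: recurse
    return {best: branches}

tree = build_tree(play_golf, ["Outlook", "Temp.", "Humidity", "Windy"])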


Page 13: Decision tree and random forest

Random Forest

Decision Tree: one tree
Random Forest: more than one tree
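The slide only contrasts the tree counts; the two ingredients that usually make the extra trees differ, bootstrap sampling and random attribute subsets, are an assumption here, sketched on top of build_tree:

import random

def build_forest(rows, attributes, n_trees=3, target="Play Golf"):
    forest = []
    for _ in range(n_trees):
        sample = [random.choice(rows) for _ in rows]   # bootstrap: draw with replacement
        k = max(1, int(len(attributes) ** 0.5))        # common heuristic: sqrt of the attribute count
        chosen = random.sample(attributes, k)          # each tree sees a random attribute subset
        forest.append(build_tree(sample, chosen, target))
    return forest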


Page 14: Decision tree and random forest

Decision Tree & Random Forest

[Diagram: a single tree (Decision Tree) versus several trees, Tree 1, Tree 2, and Tree 3, making up a Random Forest.]

Page 15: Decision tree and random forest

Decision Tree

Outlook  Temp.  Humidity  Windy  Play Golf
Rainy    Mild   High      False  ?

Result: No
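A small sketch of how the query row walks the tree; classify is my helper, and with the dataset variant assumed earlier it reproduces the slide's "No":

def classify(tree, example):
    if not isinstance(tree, dict):                 # leaf: return the stored class
        return tree
    attribute, branches = next(iter(tree.items()))
    return classify(branches[example[attribute]], example)

classify(tree, {"Outlook": "Rainy", "Temp.": "Mild",
                "Humidity": "High", "Windy": False})   # -> "No"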

Page 16: Decision tree and random forest

Random Forest

[Diagram: the same example is sent through Tree 1, Tree 2, and Tree 3.]

Tree 1: No
Tree 2: No
Tree 3: Yes

Votes: Yes 1, No 2

Result: No
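The vote itself is a two-liner; the three votes are the slide's:

from collections import Counter

votes = ["No", "No", "Yes"]                      # Tree 1, Tree 2, Tree 3
result = Counter(votes).most_common(1)[0][0]     # majority vote -> "No"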

Page 17: Decision tree and random forest

OOB Error Rate

The out-of-bag (OOB) error rate can be used to get a running, unbiased estimate of the classification error as trees are added to the forest: each tree is evaluated on the training cases left out of its bootstrap sample, so no separate test set is needed.
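A sketch of reading the OOB estimate, assuming scikit-learn (the slides name no library, and X, y stand for any encoded feature matrix and label vector):

from sklearn.ensemble import RandomForestClassifier

# With oob_score=True, each tree is scored on the samples missing from its bootstrap draw
clf = RandomForestClassifier(n_estimators=100, oob_score=True)
clf.fit(X, y)                     # X, y: hypothetical encoded training data
print(1.0 - clf.oob_score_)       # OOB error rate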
