business intelligence

BUSINESS INTELLIGENCE & DATA MINING. Association Rules: Support; Confidence; Lift; Conviction. Geo S. Mariyan (Master of Computer Science), University of Mumbai.

Upload: geo-marian

Post on 21-Mar-2017

Page 1: Business intelligence

BUSINESS INTELLIGENCE & DATA MINING

Association Rules: Support; Confidence; Lift; Conviction.

Geo S. Mariyan (Master of Computer Science)

University of Mumbai.

Page 2: Business intelligence

Association Rules: Introduction

- Association rules are created by analyzing data for frequent if/then patterns and using the criteria of support and confidence to identify the most important relationships.

- The purchase of one product when another product is purchased represents an association rule.

- Association rules are used in retail stores for marketing, advertising, floor placement, and inventory control.

Page 3: Business intelligence

Support is an indication of how frequently the items appear in the database. Confidence indicates how often the if/then statement has been found to be true, as a proportion of the cases in which the "if" part holds.

In data mining, association rules are useful for analyzing and predicting customer behavior.

They play an important part in shopping basket data analysis, product clustering, catalog design and store layout.

Example:  "If a customer buys a dozen eggs, he is 80% likely to also purchase milk."

Page 4: Business intelligence

Frequent Item Set

The most common approach to finding association rules is to break the problem into two parts:
1) Find large itemsets.
2) Generate rules from frequent itemsets.

An itemset is any subset of the set of all items. An itemset is called frequent (or large) if its support is no less than a given absolute minimal support threshold, that is, if its number of occurrences reaches that threshold. We use the notation L to indicate the complete set of large itemsets and l to indicate a specific large itemset.

The original motivation for searching for frequent sets came from the need to analyse so-called supermarket transaction data, that is, to examine customer behaviour in terms of the purchased products.
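The first part, finding the large itemsets, can be sketched by brute force: count every candidate itemset and keep those that reach the threshold. This is only an illustrative sketch, not an efficient algorithm; the transactions, the function name `frequent_itemsets`, and the parameter `min_count` are assumptions made up for this example.

```python
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def frequent_itemsets(transactions, min_count):
    """Return {itemset: count} for itemsets found in at least min_count transactions."""
    items = sorted(set().union(*transactions))
    result = {}
    # Enumerate every non-empty subset of the items (exponential, so small data only).
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_count:
                result[candidate] = count
    return result


# L is the complete set of large itemsets; each key l is one large itemset.
L = frequent_itemsets(transactions, min_count=2)
for l in sorted(L, key=lambda s: (len(s), sorted(s))):
    print(sorted(l), L[l])
```

Practical miners avoid enumerating all subsets, but the definition being applied is the same: an itemset is kept when its count is no less than the threshold.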

Page 7: Business intelligence

Association Rule Notation

Page 8: Business intelligence

Association Rule Definitions

Page 9: Business intelligence

Association Rule Example:

Page 10: Business intelligence

Algorithm to Generate Association Rules:

In this algorithm we use a function, support, which returns the support for the input itemset.
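The algorithm itself is not reproduced in this transcript. One common way to generate rules from large itemsets can be sketched as follows: for each large itemset l and each non-empty proper subset x of l, emit the rule x => (l - x) when support(l) / support(x) reaches a minimum confidence. The transactions, the function name `generate_rules`, and the parameter `min_confidence` are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def generate_rules(large_itemsets, transactions, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for l in large_itemsets:
        # Every non-empty proper subset x of l is a candidate antecedent.
        for size in range(1, len(l)):
            for x in combinations(sorted(l), size):
                x = frozenset(x)
                confidence = support(l, transactions) / support(x, transactions)
                if confidence >= min_confidence:
                    rules.append((x, l - x, confidence))
    return rules


large = [frozenset({"bread", "milk"}), frozenset({"bread", "butter"})]
rules = generate_rules(large, transactions, min_confidence=0.9)
for x, y, conf in rules:
    print(sorted(x), "=>", sorted(y), conf)
```

With these made-up transactions only butter => bread passes the 0.9 threshold, because every transaction containing butter also contains bread.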

Page 11: Business intelligence

Example

Table 1.1: Sample data to illustrate association rules

A database in which an association rule is to be found is viewed as a set of tuples, where each tuple contains a set of items. For example, a tuple could be {Bread, Jelly, PeanutButter}, which consists of three items. Table 1.1 is used throughout this topic to illustrate different algorithms. Here, there are five transactions and five items.

Page 12: Business intelligence

Support (s)

Page 13: Business intelligence

Support of All Sets of Items

Support: This says how popular an itemset is, as measured by the proportion of transactions in which the itemset appears. In Table 1, the support of {apple} is 4 out of 8, or 50%. Itemsets can also contain multiple items. For instance, the support of {apple, beer, rice} is 2 out of 8, or 25%.
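The support calculation can be sketched in a few lines of Python. Table 1 is not reproduced in this transcript, so the eight transactions below are a hypothetical stand-in, chosen only so that the two figures quoted above hold; the function name `support` is likewise an assumption.

```python
# Hypothetical stand-in for Table 1 (the original table is not in this transcript).
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


print(support({"apple"}, transactions))                  # 4 of 8 -> 0.5
print(support({"apple", "beer", "rice"}, transactions))  # 2 of 8 -> 0.25
```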

Page 14: Business intelligence

Confidence

Every association rule has a support and a confidence.

An association rule is of the form: X => Y

X => Y: if someone buys X, he also buys Y

The confidence is the conditional probability that, given X is present in a transaction, Y will also be present.

Confidence measure, by definition: Confidence(X => Y) = support(X U Y) / support(X), where support(X U Y) is the support of the transactions that contain both X and Y.
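The definition translates directly into code. The sketch below assumes a hypothetical set of eight transactions (not taken from the slides) and illustrative function names `support` and `confidence`.

```python
# Hypothetical transactions, invented for illustration only.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(x, y, transactions):
    """Confidence(X => Y) = support(X U Y) / support(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


# 3 of the 4 transactions containing apple also contain beer.
print(confidence({"apple"}, {"beer"}, transactions))
```

Note that confidence is not symmetric: Confidence(X => Y) and Confidence(Y => X) divide by different supports and generally differ.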

Page 15: Business intelligence

Support and Confidence for Some Association Rules

Confidence says how likely item Y is to be purchased when item X is purchased, expressed as {X -> Y}.

It is measured by the proportion of transactions containing item X in which item Y also appears.

Page 16: Business intelligence

Lift

Page 17: Business intelligence

Lift says how likely item Y is to be purchased when item X is purchased, while controlling for how popular item Y is. In Table 1, the lift of {apple -> beer} is 1, which implies no association between the items.

A lift value greater than 1 means that item Y is more likely to be bought if item X is bought, while a value less than 1 means that item Y is less likely to be bought if item X is bought.
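A minimal sketch of the lift calculation, assuming the usual definition lift(X -> Y) = support(X U Y) / (support(X) * support(Y)). Table 1 is not reproduced in this transcript, so the transactions below are a hypothetical stand-in chosen so that the lift of {apple -> beer} comes out as 1; the function names are illustrative.

```python
# Hypothetical stand-in for Table 1 (the original table is not in this transcript).
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def lift(x, y, transactions):
    """lift(X -> Y) = support(X U Y) / (support(X) * support(Y))."""
    joint = support(set(x) | set(y), transactions)
    return joint / (support(x, transactions) * support(y, transactions))


print(lift({"apple"}, {"beer"}, transactions))  # 1.0: no association
print(lift({"beer"}, {"rice"}, transactions))   # above 1: bought together more than expected
print(lift({"beer"}, {"pear"}, transactions))   # 0.0: never bought together in this data
```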

Page 18: Business intelligence

Conviction
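A minimal sketch of conviction, assuming the commonly used definition conviction(X -> Y) = (1 - support(Y)) / (1 - confidence(X -> Y)), which is not spelled out in this transcript. Under that definition a value of 1 indicates that X and Y are unrelated, larger values indicate that the rule fails less often than it would by chance, and the value is infinite when the rule always holds. The transactions and function names are hypothetical.

```python
import math

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(x, y, transactions):
    """Confidence(X => Y) = support(X U Y) / support(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def conviction(x, y, transactions):
    """(1 - support(Y)) / (1 - confidence(X -> Y)); infinite if the rule never fails."""
    conf = confidence(x, y, transactions)
    if conf == 1:
        return math.inf
    return (1 - support(y, transactions)) / (1 - conf)


print(conviction({"apple"}, {"beer"}, transactions))  # 1.0: unrelated items
print(conviction({"beer"}, {"rice"}, transactions))   # above 1
print(conviction({"rice"}, {"beer"}, transactions))   # inf: rice always comes with beer here
```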

Page 21: Business intelligence

Confidence or Strength:

The confidence or strength (α) for an association rule X => Y is the ratio of the number of transactions that contain X U Y to the number of transactions that contain X.

The selection of association rules is based on these two values, support and confidence, as described in the definition of the association rule problem.

Confidence measures the strength of the rule, whereas support measures how often it should occur in the database.
