association rule mining - maxminer. mining association rules in large databases association rule...

17
Association Rule Mining - MaxMiner

Post on 21-Dec-2015

255 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Association Rule Mining

- MaxMiner

Page 2: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Mining Association Rules in Large Databases

Association rule mining

Algorithms Apriori and FP-Growth

Max and closed patterns

Mining various kinds of association/correlation

rules

Page 3: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Max-patterns & Close-patterns If there are frequent patterns with many

items, enumerating all of them is costly. We may be interested in finding the

‘boundary’ frequent patterns. Two types…

Page 4: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Max-patterns Frequent pattern {a1, …, a100} (100

1) + (100

2) + … + (11

00

00) = 2100-1 = 1.27*1030

frequent sub-patterns! Max-pattern: frequent patterns without

proper frequent super pattern BCDE, ACD are max-patterns BCD is not a max-pattern

Tid Items

10 A,B,C,D,E

20 B,C,D,E,

30 A,C,D,FMin_sup=2

Page 5: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Maximal Frequent Itemset

null

AB AC AD AE BC BD BE CD CE DE

A B C D E

ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE

ABCD ABCE ABDE ACDE BCDE

ABCDE

Border

Infrequent Itemsets

Maximal Itemsets

An itemset is maximal frequent if none of its immediate supersets is frequent

Page 6: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Closed Itemset An itemset is closed if none of its immediate

supersets has the same support as the itemset

TID Items1 {A,B}2 {B,C,D}3 {A,B,C,D}4 {A,B,D}5 {A,B,C,D}

Itemset Support{A} 4{B} 5{C} 3{D} 4{A,B} 4{A,C} 2{A,D} 3{B,C} 3{B,D} 4{C,D} 3

Itemset Support{A,B,C} 2{A,B,D} 3{A,C,D} 2{B,C,D} 3{A,B,C,D} 2

Page 7: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Maximal vs Closed Itemsets

TID Items

1 ABC

2 ABCD

3 BCE

4 ACDE

5 DE

null

AB AC AD AE BC BD BE CD CE DE

A B C D E

ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE

ABCD ABCE ABDE ACDE BCDE

ABCDE

124 123 1234 245 345

12 124 24 4 123 2 3 24 34 45

12 2 24 4 4 2 3 4

2 4

Transaction Ids

Not supported by any transactions

Page 8: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Maximal vs Closed Frequent Itemsets

null

AB AC AD AE BC BD BE CD CE DE

A B C D E

ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE

ABCD ABCE ABDE ACDE BCDE

ABCDE

124 123 1234 245 345

12 124 24 4 123 2 3 24 34 45

12 2 24 4 4 2 3 4

2 4

Minimum support = 2

# Closed = 9

# Maximal = 4

Closed and maximal

Closed but not maximal

Page 9: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Maximal vs Closed Itemsets

FrequentItemsets

ClosedFrequentItemsets

MaximalFrequentItemsets

Page 10: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

MaxMiner: Mining Max-patterns

Idea: generate the complete set-enumeration tree one level at a time, while prune if applicable.

(ABCD)

A (BCD) B (CD) C (D) D ()

AB (CD) AC (D) AD () BC (D) BD () CD ()

ABC (C)

ABCD ()

ABD () ACD () BCD ()

Page 11: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Local Pruning Techniques (e.g. at node A)

Check the frequency of ABCD and AB, AC, AD. If ABCD is frequent, prune the whole sub-tree. If AC is NOT frequent, remove C from the

parenthesis before expanding.

(ABCD)

A (BCD) B (CD) C (D) D ()

AB (CD) AC (D) AD () BC (D) BD () CD ()

ABC (C)

ABCD ()

ABD () ACD () BCD ()

Page 12: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Algorithm MaxMiner

Initially, generate one node N= , where h(N)= and t(N)={A,B,C,D}.

Consider expanding N, If h(N)t(N) is frequent, do not expand N. If for some it(N), h(N){i} is NOT frequent,

remove i from t(N) before expanding N. Apply global pruning techniques…

(ABCD)

Page 13: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Global Pruning Technique (across sub-trees)

When a max pattern is identified (e.g. ABCD), prune all nodes (e.g. B, C and D) where h(N)t(N) is a sub-set of it (e.g. ABCD).

(ABCD)

A (BCD) B (CD) C (D) D ()

AB (CD) AC (D) AD () BC (D) BD () CD ()

ABC (C)

ABCD ()

ABD () ACD () BCD ()

Page 14: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Example

Tid Items

10 A,B,C,D,E

20 B,C,D,E,

30 A,C,D,F

(ABCDEF)

Items Frequency

ABCDEF 0

A 2

B 2

C 3

D 3

E 2

F 1

Min_sup=2

Max patterns:

A (BCDE)B (CDE) C (DE) E ()D (E)

Page 15: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Example

Tid Items

10 A,B,C,D,E

20 B,C,D,E,

30 A,C,D,F

(ABCDEF)

Items Frequency

ABCDE 1

AB 1

AC 2

AD 2

AE 1

Min_sup=2

A (BCDE)B (CDE) C (DE) E ()D (E)

AC (D) AD ()

Max patterns:

Node A

Page 16: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Example

Tid Items

10 A,B,C,D,E

20 B,C,D,E,

30 A,C,D,F

(ABCDEF)

Items Frequency

BCDE 2

BC

BD

BE

Min_sup=2

A (BCDE)B (CDE) C (DE) E ()D (E)

AC (D) AD ()

Max patterns:

BCDE

Node B

Page 17: Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and

Example

Tid Items

10 A,B,C,D,E

20 B,C,D,E,

30 A,C,D,F

(ABCDEF)

Items Frequency

ACD 2

Min_sup=2

A (BCDE)B (CDE) C (DE) E ()D (E)

AC (D) AD ()

Max patterns:

BCDEACD

Node AC