market basket analysis-2011
Market Basket Analysis and Association Rules
2
What can be inferred?
I purchase diapers
I purchase a new car
I purchase OTC cough medicine
I purchase a prescription medication
I don't show up for class
3
What are Association Rules?
Study of "what goes with what"
  "Customers who bought X also bought Y"
  What symptoms go with what diagnosis
Transaction-based or event-based
Also called "market basket analysis" and "affinity analysis"
Originated with the study of customer transaction databases to determine associations among items purchased
4
What is Market Basket Analysis?
Understanding behavior of shoppers
  What items are bought together
  What's in each shopping cart/basket
Basket data consist of a collection of transaction dates and the items bought in each transaction
  Itemset
How does this data differ from a transaction database?
  Pivoting
Retail organizations are interested in generating qualified decisions and strategy based on analysis of transaction data
  What to put on sale, how to place merchandise on shelves for maximizing profit, customer segmentation based on buying pattern
5
Examples
Rule form: LHS => RHS
  IF a customer buys diapers, THEN they also buy beer
  diapers => beer
"Transactions that purchase bread and butter also purchase milk"
  bread ^ butter => milk
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
6
Evaluation
Support: a measure of how often the collection of items in an association occurs together, as a percentage of all the transactions
  In 2% of the purchases at a hardware store, both pick and shovel were bought
  support = tuples(LHS, RHS) / N
Confidence: the confidence of rule "B given A" is a measure of how likely it is that B occurs when A has occurred
  100% meaning that B always occurs if A has occurred
  confidence = tuples(LHS, RHS) / tuples(LHS)
  Example: bread and butter => milk [90%, 1%]
Rules originating from the same itemset have identical support but can have different confidence
7
The association rules mining problem
Generate all association rules from the given dataset that have
  support greater than a specified minimum
and
  confidence greater than a specified minimum
8
Examples
Rule form: LHS => RHS [confidence, support]
  diapers => beer [60%, 0.5%]
"90% of transactions that purchase bread and butter also purchase milk"
  bread and butter => milk [90%, 1%]
9
Example
Tr   Items
T1   Beer, Milk
T2   Bread, Butter
T3   Bread, Butter, Jelly
T4   Bread, Butter, Milk
T5   Beer, Bread

Large itemsets with minsup = 30%
Itemset          Support
Bread            80%
Butter           60%
Milk             40%
Beer             40%
Bread, Butter    60%

Consider the itemset {Bread, Butter} and the two possible rules
  Bread => Butter
  Butter => Bread
Support(Bread, Butter) / Support(Bread) = 75%, i.e. Confidence(Bread => Butter) = 75%
Support(Bread, Butter) / Support(Butter) = 100%, i.e. Confidence(Butter => Bread) = 100%
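The numbers on this slide can be re-derived with a few lines of Python. This is only a minimal sketch: the helper names `support` and `confidence` are illustrative choices rather than anything taken from the slides, and exact fractions are used so the percentages come out without rounding noise.

```python
from fractions import Fraction

# The five transactions T1..T5 from the example table.
transactions = {
    "T1": {"Beer", "Milk"},
    "T2": {"Bread", "Butter"},
    "T3": {"Bread", "Butter", "Jelly"},
    "T4": {"Bread", "Butter", "Milk"},
    "T5": {"Beer", "Bread"},
}

def support(itemset):
    """Fraction of all transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return Fraction(hits, len(transactions))

def confidence(lhs, rhs):
    """support(LHS and RHS together) divided by support(LHS)."""
    return support(set(lhs) | set(rhs)) / support(lhs)

print("support(Bread, Butter)      =", float(support({"Bread", "Butter"})))
print("confidence(Bread => Butter) =", float(confidence({"Bread"}, {"Butter"})))
print("confidence(Butter => Bread) =", float(confidence({"Butter"}, {"Bread"})))
```

Both rules share the same support (3 of 5 transactions), yet their confidences differ because the denominators, support(Bread) and support(Butter), differ.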
10
How Good is an Association Rule?
Are support and confidence enough?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
  Lift = P(LHS ^ RHS) / (P(LHS) P(RHS))
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
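As a worked illustration, the lift formula can be applied to the five-transaction bread and butter example from the earlier slide. The choice of these two rules is ours, not the slides'; the probabilities are read off that table (Bread 80%, Butter 60%, Milk 40%, Bread and Butter together 60%, Bread and Milk together 20%).

```latex
\[
\mathrm{lift}(\text{Bread} \Rightarrow \text{Butter})
  = \frac{P(\text{Bread} \wedge \text{Butter})}{P(\text{Bread})\,P(\text{Butter})}
  = \frac{0.60}{0.80 \times 0.60} = 1.25 > 1
\]
\[
\mathrm{lift}(\text{Bread} \Rightarrow \text{Milk})
  = \frac{P(\text{Bread} \wedge \text{Milk})}{P(\text{Bread})\,P(\text{Milk})}
  = \frac{0.20}{0.80 \times 0.40} = 0.625 < 1
\]
```

The first rule does better than guessing Butter at its base rate; the second does worse, so for Milk the negative rule is the more informative one.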
11
The Problem of Lots of Data
Fast food restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Also, the number of transactions in a given time period could be huge (hence expensive to analyze)
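The counts on this slide are binomial coefficients ("n choose k"), which can be checked directly. A small sketch using Python's standard `math.comb`; the variable names are illustrative.

```python
from math import comb

# Number of distinct 3-item combinations from a 100-item menu.
menu_triples = comb(100, 3)

# Number of distinct 2-item and 3-item combinations from 10,000 unique items.
supermarket_pairs = comb(10_000, 2)
supermarket_triples = comb(10_000, 3)

print(f"{menu_triples:,}")         # 161,700
print(f"{supermarket_pairs:,}")    # 49,995,000
print(f"{supermarket_triples:,}")  # 166,616,670,000
```

The supermarket figures on the slide are rounded: the exact pair count is just under 50 million and the exact triple count is roughly 167 billion.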
12
Preparing Data for MBA
Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to appropriate level
  Depends on objective of model
  Rolling up rare items to get adequate support
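The two preparation steps, converting item lines into itemsets and rolling items up a product hierarchy, might look like the sketch below. Everything in it is hypothetical: the item lines, the `hierarchy` mapping, and the function name `to_itemsets` are made up for illustration and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical point-of-sale item lines: (transaction id, item purchased).
item_lines = [
    (1, "2% milk"), (1, "rye bread"),
    (2, "skim milk"), (2, "white bread"), (2, "butter"),
    (3, "rye bread"), (3, "butter"),
]

# Hypothetical product hierarchy: detailed item -> more general category.
hierarchy = {
    "2% milk": "milk", "skim milk": "milk",
    "rye bread": "bread", "white bread": "bread",
}

def to_itemsets(lines, rollup=None):
    """Pivot item lines into one set of items per transaction.

    If `rollup` is given, each item is replaced by its category when one exists.
    """
    rollup = rollup or {}
    baskets = defaultdict(set)
    for tid, item in lines:
        baskets[tid].add(rollup.get(item, item))
    return dict(baskets)

detailed = to_itemsets(item_lines)
generalized = to_itemsets(item_lines, hierarchy)

print(detailed)
print(generalized)
```

At the detailed level "skim milk" appears in only one basket, while the rolled-up item "milk" appears in two, which is how generalizing rare items raises their support.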
14
Search Approach
Two sub-problems in discovering all association rules:
Find all sets of items (itemsets) that have transaction support above minimum support
  Itemsets that qualify are called large itemsets, and all others small itemsets
Generate from each large itemset rules that use items from the large itemset
  Given a large itemset Y, and X a subset of Y
  Take the support of Y and divide it by the support of X
  If the ratio c is at least minconf, then X => (Y - X) is satisfied with confidence factor c
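The second sub-problem, generating rules from one large itemset, can be sketched as follows. This is an illustrative sketch, not the slides' implementation: the function name `rules_from` is an assumption, and the counts are the transaction counts behind the bread and butter example (Bread in 4 of 5 transactions, Butter in 3, both in 3). A ratio of counts equals the ratio of supports because the total N cancels.

```python
from itertools import combinations

# Number of transactions containing each itemset (from the 5-transaction example).
counts = {
    frozenset({"Bread"}): 4,
    frozenset({"Butter"}): 3,
    frozenset({"Bread", "Butter"}): 3,
}

def rules_from(large_itemset, counts, minconf):
    """Return (X, Y - X, c) for every non-empty proper subset X of Y with c >= minconf."""
    y = frozenset(large_itemset)
    rules = []
    for size in range(1, len(y)):
        for x in map(frozenset, combinations(sorted(y), size)):
            c = counts[y] / counts[x]  # support(Y) / support(X)
            if c >= minconf:
                rules.append((set(x), set(y - x), c))
    return rules

for lhs, rhs, c in rules_from({"Bread", "Butter"}, counts, minconf=0.8):
    print(lhs, "=>", rhs, f"confidence = {c:.0%}")
```

With minconf = 0.8 only Butter => Bread (100%) survives; Bread => Butter (75%) is dropped.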
15
Reducing Number of Candidates
Apriori principle
  If an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
  Also, any superset of a small itemset is also small
Itemset lattice for items A, B, C, D:
A    B    C    D
AB   AC   AD   BC   BD   CD
ABC  ABD  ACD  BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example: Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
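The level-wise procedure above can be sketched in Python. This is a simplified, unoptimized sketch of the Apriori idea, not a production implementation; the function name `apriori` and its signature are illustrative, and the data is the five-transaction bread and butter example from the earlier slide, because the faceplate transactions themselves are not reproduced in this transcript.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Step 2: one-item sets that meet the support criterion.
    items = sorted({item for t in transactions for item in t})
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    # Steps 3-5: build k-item candidates from the frequent (k-1)-item sets.
    k = 2
    while current:
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Apriori pruning: every (k-1)-item subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent

baskets = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]
frequent = apriori(baskets, min_support=0.3)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{s:.0%}")
```

With a 30% minimum support this reproduces the large itemsets listed in the earlier example: Bread, Butter, Milk, Beer, and {Bread, Butter}. Jelly is dropped at the first level, so no candidate containing it is ever counted.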
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support, see above)
  From these item sets, generate rules with sufficient confidence
29
Example Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)    Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green =>          Red, White      2           4           2               2.5
2     100      Green =>          Red             2           6           2               1.666667
3     100      Green, White =>   Red             2           6           2               1.666667
4     100      Green =>          White           2           7           2               1.428571
5     100      Green, Red =>     White           2           7           2               1.428571
6     100      Orange =>         White           2           7           2               1.428571
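The Lift Ratio column can be recomputed from the support counts in the table using lift = confidence / benchmark confidence. The sketch below assumes a total of 10 transactions, which is taken from the earlier support example ("4 out of 10 transactions"); the function name `lift_ratio` is illustrative.

```python
N = 10  # total transactions (assumed from the "4 out of 10" support example)

# (antecedent, consequent, support(a), support(c), support(a U c)) as counts
rules = [
    ("Green", "Red, White", 2, 4, 2),
    ("Green", "Red", 2, 6, 2),
    ("Green, White", "Red", 2, 6, 2),
    ("Green", "White", 2, 7, 2),
    ("Green, Red", "White", 2, 7, 2),
    ("Orange", "White", 2, 7, 2),
]

def lift_ratio(sup_a, sup_c, sup_ac, n):
    confidence = sup_ac / sup_a   # share of antecedent transactions with the consequent
    benchmark = sup_c / n         # share of all transactions with the consequent
    return confidence / benchmark

lifts = [round(lift_ratio(a, c, ac, N), 6) for _, _, a, c, ac in rules]
for (ante, cons, *_), lift in zip(rules, lifts):
    print(f"{ante} => {cons}: lift = {lift}")
```

All six rules have 100% confidence, so the differences in lift come entirely from how common the consequent already is: the rarer the consequent, the higher the lift.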
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ^ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ^ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence (P(Y ^ Z)) to the expected frequency of co-occurrence under the assumption of statistical independence of Y and Z (P(Y) P(Z))
  interest = P(Y ^ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)...
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well-known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule?

POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
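The co-occurrence table can be rebuilt from the five POS transactions by counting, for each pair of products, the orders that contain both. A minimal sketch; the function name `co_occurrence` and the nested-dictionary layout are illustrative choices.

```python
# The five POS transactions, one set of products per customer.
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
products = ["Coke", "window cleaner", "milk", "soda", "detergent"]

def co_occurrence(baskets, products):
    """matrix[a][b] = number of baskets containing both a and b.

    The diagonal matrix[a][a] is simply the number of baskets containing a.
    """
    return {
        a: {b: sum(1 for basket in baskets if a in basket and b in basket)
            for b in products}
        for a in products
    }

matrix = co_occurrence(baskets, products)
for a in products:
    print(f"{a:15}", *(matrix[a][b] for b in products))
```

The matrix is symmetric, so only the upper triangle needs to be stored when the number of products is large.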
46
How Good is an Association Rule?

                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns
1. Coke and soda are purchased together as often as any other two items (2 of 5 orders, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule?
What is the confidence for this rule?
  If a customer purchases soda, then the customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
48
How Good is an Association Rule?
How much better than chance is a rule?
  Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift...
  When lift > 1, the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
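To make the lift calculation concrete, here is a sketch that applies it to the rule "if soda then Coke" from the five POS transactions, and to its negative rule "if soda then NOT Coke". The choice of rule and the helper name `prob` are ours; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# The five POS transactions, one set of products per customer.
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def prob(predicate):
    """Fraction of baskets for which `predicate` is true."""
    return Fraction(sum(1 for b in baskets if predicate(b)), len(baskets))

p_soda = prob(lambda b: "soda" in b)                    # 3/5
p_coke = prob(lambda b: "Coke" in b)                    # 4/5
p_both = prob(lambda b: "soda" in b and "Coke" in b)    # 2/5

# Rule: if soda then Coke.
lift_rule = p_both / (p_soda * p_coke)

# Negative rule: if soda then NOT Coke.
p_not_coke = 1 - p_coke
p_soda_not_coke = prob(lambda b: "soda" in b and "Coke" not in b)
lift_negative = p_soda_not_coke / (p_soda * p_not_coke)

print("lift(soda => Coke)     =", lift_rule)      # 5/6
print("lift(soda => not Coke) =", lift_negative)  # 5/3
```

Although "if soda then Coke" has 67% confidence, Coke appears in 80% of all orders, so the rule's lift is below 1; the negative rule has lift above 1, as the slide describes.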
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide a guide to action: to offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
2
What can be inferred
I purchase diapersI purchase diapers
I purchase a new carI purchase a new car
I purchase OTC cough medicineI purchase OTC cough medicine
I purchase a prescription I purchase a prescription
medicationmedication
I donrsquot show up for classI donrsquot show up for class
3
What are Association Rules
Study of ldquowhat goes with whatrdquoStudy of ldquowhat goes with whatrdquo ldquoldquoCustomers who bought X also bought YrdquoCustomers who bought X also bought Yrdquo What symptoms go with what diagnosisWhat symptoms go with what diagnosis
Transaction-based or event-basedTransaction-based or event-based Also called ldquomarket basket analysisrdquo Also called ldquomarket basket analysisrdquo
and ldquoaffinity analysisrdquoand ldquoaffinity analysisrdquo Originated with study of customer Originated with study of customer
transactions databases to determine transactions databases to determine associations among items purchasedassociations among items purchased
4
What is Market Basket Analysis
Understanding behavior of shoppersUnderstanding behavior of shoppers What items are bought togetherWhat items are bought together
Whatrsquos in each shopping cartbasketWhatrsquos in each shopping cartbasket
Basket data consist of collection of transaction Basket data consist of collection of transaction date and items bought in a transactiondate and items bought in a transaction ItemsetItemset
How does this data differ from a transaction How does this data differ from a transaction databasedatabase PivotingPivoting
Retail organizations interested in generating Retail organizations interested in generating qualified decisions and strategy based on analysis qualified decisions and strategy based on analysis of transaction data of transaction data what to put on sale how to place merchandise on shelves for what to put on sale how to place merchandise on shelves for
maximizing profit customer segmentation based on buying maximizing profit customer segmentation based on buying patternpattern
5
Examples
Rule form: LHS => RHS
  IF a customer buys diapers, THEN they also buy beer
  diapers => beer
"Transactions that purchase bread and butter also purchase milk"
  bread and butter => milk
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
6
Evaluation
Support: a measure of how often the collection of items in an association occur together, as a percentage of all the transactions
  In 2% of the purchases at a hardware store, both a pick and a shovel were bought
  support = tuples(LHS, RHS) / N
Confidence: the confidence of rule "B given A" is a measure of how likely it is that B occurs when A has occurred
  100% meaning that B always occurs if A has occurred
  confidence = tuples(LHS, RHS) / tuples(LHS)
  Example: bread and butter => milk [90%, 1%]
Rules originating from the same itemset have identical support but can have different confidence
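A minimal Python sketch of the two formulas on this slide. The function names and the four hardware-store baskets are invented for illustration; they are not from the slides.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """support(LHS and RHS) divided by support(LHS)."""
    both = set(lhs) | set(rhs)
    return support(transactions, both) / support(transactions, lhs)


# Hypothetical baskets, invented for this sketch.
baskets = [
    {"pick", "shovel"},
    {"pick", "shovel", "gloves"},
    {"pick"},
    {"gloves"},
]

sup = support(baskets, {"pick", "shovel"})               # 2 of 4 baskets
conf_pick_shovel = confidence(baskets, {"pick"}, {"shovel"})  # 2 of 3 pick baskets
conf_shovel_pick = confidence(baskets, {"shovel"}, {"pick"})  # 2 of 2 shovel baskets
print(sup, conf_pick_shovel, conf_shovel_pick)
```

Both rules come from the same itemset {pick, shovel}, so they share one support value, while their confidences differ because the denominators differ.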
7
The association rules mining problem
Generate all association rules from the given dataset that have
  support greater than a specified minimum, and
  confidence greater than a specified minimum
8
Examples
Rule form: LHS => RHS [confidence, support]
  diapers => beer [60%, 0.5%]
"90% of transactions that purchase bread and butter also purchase milk"
  bread and butter => milk [90%, 1%]
9
Example
Tr   Items
T1   Beer, Milk
T2   Bread, Butter
T3   Bread, Butter, Jelly
T4   Bread, Butter, Milk
T5   Beer, Bread

Large itemsets with minsup = 30%

Itemset          Support
{Bread}          80%
{Butter}         60%
{Milk}           40%
{Beer}           40%
{Bread, Butter}  60%

Consider the itemset {Bread, Butter} and the two possible rules
  Bread => Butter
  Butter => Bread

Support(Bread, Butter) / Support(Bread) = 60% / 80% = 75%
  i.e. Confidence(Bread => Butter) = 75%
Support(Bread, Butter) / Support(Butter) = 60% / 60% = 1
  i.e. Confidence(Butter => Bread) = 100%
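The numbers on this slide can be checked with a short Python sketch. The transactions are the five from the slide; the helper name `count` is an assumption made for the example.

```python
transactions = {
    "T1": {"Beer", "Milk"},
    "T2": {"Bread", "Butter"},
    "T3": {"Bread", "Butter", "Jelly"},
    "T4": {"Bread", "Butter", "Milk"},
    "T5": {"Beer", "Bread"},
}


def count(items):
    """Number of transactions containing every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions.values())


n = len(transactions)
both = count({"Bread", "Butter"})

support_both = both / n                        # 3 of 5 transactions
conf_bread_butter = both / count({"Bread"})    # 3 of the 4 Bread transactions
conf_butter_bread = both / count({"Butter"})   # 3 of the 3 Butter transactions

print(f"support(Bread, Butter)      = {support_both:.0%}")
print(f"confidence(Bread => Butter) = {conf_bread_butter:.0%}")
print(f"confidence(Butter => Bread) = {conf_butter_bread:.0%}")
```

Working from raw counts rather than from the rounded percentages avoids floating-point noise in the ratios.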
10
How Good is an Association Rule
Are support and confidence enough?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
  Lift = P(LHS ∧ RHS) / (P(LHS) P(RHS))
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
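As a sketch of the lift formula, the Python below applies it to the five bread/butter transactions from the earlier example slide. Reusing that data here is a choice made for illustration, and the helper names are assumptions.

```python
transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]


def count(items):
    """Number of transactions containing every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions)


def lift(lhs, rhs):
    """P(LHS and RHS) / (P(LHS) * P(RHS)), computed from raw counts."""
    n = len(transactions)
    both = count(set(lhs) | set(rhs))
    return n * both / (count(lhs) * count(rhs))


lift_bread_butter = lift({"Bread"}, {"Butter"})  # 5*3 / (4*3)
lift_bread_milk = lift({"Bread"}, {"Milk"})      # 5*1 / (4*2)
print(lift_bread_butter, lift_bread_milk)
```

On this data Bread => Butter has lift above 1 and Bread => Milk has lift below 1, which are the two cases the slide distinguishes.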
11
The Problem of Lots of Data
Fast food restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  About 50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
The number of transactions in a given time period could also be huge (hence expensive to analyze)
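The counts on this slide are binomial coefficients ("n choose k"). A quick check with Python's standard library (`math.comb`, available from Python 3.8); the variable names are chosen for the example:

```python
from math import comb

menu_triples = comb(100, 3)            # 3-item combinations from 100 menu items
supermarket_pairs = comb(10_000, 2)    # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)  # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```

The exact values are 161,700, then 49,995,000 (about 50 million), then 166,616,670,000 (roughly 167 billion).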
12
Preparing Data for MBA
Determining the scope of the dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to an appropriate level
  Depends on the objective of the model
  Rolling up rare items to get adequate support
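A minimal sketch of the two preparation steps: pivoting line items into itemsets, and rolling items up a product hierarchy. The line items, the hierarchy, and the function name `to_itemsets` are all hypothetical, invented for this example.

```python
from collections import defaultdict

# Hypothetical point-of-sale line items: (transaction id, item).
line_items = [
    (1, "Coke 12oz"), (1, "Pepsi 2L"),
    (2, "Whole milk 1gal"), (2, "Coke 2L"),
    (3, "Skim milk 1qt"),
]

# Hypothetical product hierarchy mapping each item to a broader category.
hierarchy = {
    "Coke 12oz": "soda", "Coke 2L": "soda", "Pepsi 2L": "soda",
    "Whole milk 1gal": "milk", "Skim milk 1qt": "milk",
}


def to_itemsets(rows, rollup=None):
    """Pivot (transaction id, item) rows into one set of items per transaction."""
    baskets = defaultdict(set)
    for tid, item in rows:
        # With a rollup mapping, replace the item by its category when known.
        baskets[tid].add(rollup.get(item, item) if rollup else item)
    return dict(baskets)


item_level = to_itemsets(line_items)
category_level = to_itemsets(line_items, hierarchy)
print(item_level)
print(category_level)
```

At item level every product here appears in a single basket; after the roll-up, "soda" and "milk" each appear in two of three baskets, which is the sense in which generalizing rare items yields adequate support.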
14
Search Approach
Two sub-problems in discovering all association rules
Find all sets of items (itemsets) that have transaction support above minimum support
  Itemsets that qualify are called large itemsets, and all others small itemsets
Generate from each large itemset the rules that use items from that large itemset
  Given a large itemset Y, and X a subset of Y
  Take the support of Y and divide it by the support of X
  If the ratio c is at least minconf, then the rule X => (Y - X) is satisfied with confidence factor c
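The second sub-problem can be sketched in Python as follows. The function name, the use of support counts instead of percentages, and the minconf value of 0.8 are assumptions for illustration; the counts are those of the bread/butter example.

```python
from itertools import combinations


def rules_from_itemset(Y, support, minconf):
    """Return (X, Y - X, c) for each non-empty proper subset X of Y with c >= minconf."""
    Y = frozenset(Y)
    rules = []
    for size in range(1, len(Y)):
        for subset in combinations(sorted(Y), size):
            X = frozenset(subset)
            c = support[Y] / support[X]  # confidence factor of X => (Y - X)
            if c >= minconf:
                rules.append((set(X), set(Y - X), c))
    return rules


# Support counts out of 5 transactions (bread/butter example).
support = {
    frozenset({"Bread"}): 4,
    frozenset({"Butter"}): 3,
    frozenset({"Bread", "Butter"}): 3,
}

rules = rules_from_itemset({"Bread", "Butter"}, support, minconf=0.8)
print(rules)
```

With minconf = 0.8, only Butter => Bread (c = 1.0) is kept; Bread => Butter (c = 0.75) is rejected.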
15
Reducing Number of Candidates
Apriori principle
  If an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
  Also, any superset of a small itemset is also small
[Figure: lattice of itemsets over items A, B, C, D]
  A    B    C    D
  AB   AC   AD   BC   BD   CD
  ABC  ABD  ACD  BCD
  ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g. products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e. have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = number (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
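The steps above can be sketched as a small level-wise search in Python. This is an illustrative sketch, not a tuned implementation: the function name `apriori`, the set-based data layout, and the reuse of the five bread/butter transactions with minimum support 0.3 are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset that meets min_support."""
    n = len(transactions)

    def sup(itemset):
        return sum(itemset <= basket for basket in transactions) / n

    # Step 2: one-item sets that meet the support criterion.
    items = {item for basket in transactions for item in basket}
    current = {frozenset([i]) for i in items if sup(frozenset([i])) >= min_support}
    frequent = {s: sup(s) for s in current}

    k = 2
    while current:
        # Steps 3-5: join frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Drop candidates with an infrequent (k-1)-item subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if sup(c) >= min_support}
        frequent.update({c: sup(c) for c in current})
        k += 1
    return frequent


transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]
frequent = apriori(transactions, min_support=0.3)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{s:.0%}")
```

On this data the search stops after the two-item level, because {Bread, Butter} is the only frequent pair and no three-item candidate can be formed from it.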
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e. more useful than just selecting transactions randomly)
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support; see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more, with confidence of 100%, 33%, 29% & 100%
If the confidence criterion is 70%, report only rules 2, 3 and 6
30
All Rules (XLMiner Output)
Rule #  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1       100      Green =>         Red, White      2           4           2               2.5
2       100      Green =>         Red             2           6           2               1.666667
3       100      Green, White =>  Red             2           6           2               1.666667
4       100      Green =>         White           2           7           2               1.428571
5       100      Green, Red =>    White           2           7           2               1.428571
6       100      Orange =>        White           2           7           2               1.428571
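The lift ratios in this output can be recomputed from the support columns. The sketch below assumes the supports are counts out of N = 10 transactions, as the earlier Support slide suggests ("4 out of 10 transactions"); the variable names are chosen for the example.

```python
N = 10  # assumed number of transactions (from "4 out of 10" on the Support slide)

# (rule number, support(a), support(c), support(a U c), reported lift ratio)
rows = [
    (1, 2, 4, 2, 2.5),
    (2, 2, 6, 2, 1.666667),
    (3, 2, 6, 2, 1.666667),
    (4, 2, 7, 2, 1.428571),
    (5, 2, 7, 2, 1.428571),
    (6, 2, 7, 2, 1.428571),
]

computed = {}
for rule, sup_a, sup_c, sup_ac, reported in rows:
    confidence = sup_ac / sup_a   # share of antecedent transactions with the consequent
    benchmark = sup_c / N         # share of all transactions with the consequent
    computed[rule] = confidence / benchmark
    print(rule, f"conf={confidence:.0%}", f"lift={computed[rule]:.6f}", f"reported={reported}")
```

Under that assumption the recomputed values agree with the Lift Ratio column to the precision shown.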
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being the most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
Data mining (DM) technique most closely allied with Market Basket Analysis
Association rules (AR) can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z
  Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y)P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
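The three measures can be written side by side to show how they relate. The derivation below uses only the definitions on this slide; the last equality shows that interest is confidence divided by P(Z), which is the same quantity the earlier slides call lift.

```latex
\begin{align*}
\text{support}(Y \Rightarrow Z)    &= P(Y \wedge Z) \\
\text{confidence}(Y \Rightarrow Z) &= \frac{P(Y \wedge Z)}{P(Y)} = P(Z \mid Y) \\
\text{interest}(Y \Rightarrow Z)   &= \frac{P(Y \wedge Z)}{P(Y)\,P(Z)}
                                    = \frac{\text{confidence}(Y \Rightarrow Z)}{P(Z)}
\end{align*}
```

An interest of exactly one means P(Y ∧ Z) = P(Y)P(Z), i.e. Y and Z occur together just as often as independence would predict.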
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)...
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg. number of orders per customer
  Avg. number of unique items per order
  Avg. number of items per order
  For a product
    What % of customers have purchased it
    Avg. number of orders per customer that include it
    Avg. quantity of it purchased per order
  Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did the purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well known by those familiar with the business
  Inexplicable Rules: have no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products

                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
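A short Python sketch of how the co-occurrence table follows from the five POS transactions. The data are from the slide; the nested-dictionary layout and variable names are assumptions made for the example.

```python
from itertools import product

transactions = [
    {"Coke", "soda"},
    {"Milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
items = ["Coke", "window cleaner", "Milk", "soda", "detergent"]

# cooc[a][b] = number of transactions containing both a and b.
# The diagonal cooc[a][a] is simply the number of transactions containing a.
cooc = {a: {b: 0 for b in items} for a in items}
for basket in transactions:
    for a, b in product(basket, repeat=2):
        cooc[a][b] += 1

for a in items:
    print(f"{a:15}", [cooc[a][b] for b in items])
```

The matrix is symmetric, so only the diagonal and one triangle carry distinct information.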
46
How Good is an Association Rule
Co-occurrence of Products

                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns
1. Coke and soda are purchased together more often than most other pairs of items (in this table, Coke and detergent co-occur equally often)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
  If a customer purchases soda, then the customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

POS Transactions
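Both confidences can be read directly off the co-occurrence counts: the off-diagonal entry gives the transactions with all the items, and the diagonal entry gives the transactions with the "if" item. A sketch in Python, with only the two relevant rows of the matrix typed in (the dictionary layout and function name are assumptions):

```python
# Counts taken from the co-occurrence table; only the rows needed here.
cooc = {
    "Coke": {"Coke": 4, "soda": 2},
    "soda": {"Coke": 2, "soda": 3},
}


def confidence(if_item, then_item):
    """Transactions with both items divided by transactions with the 'if' item."""
    return cooc[if_item][then_item] / cooc[if_item][if_item]


conf_soda_coke = confidence("soda", "Coke")   # 2 out of 3
conf_coke_soda = confidence("Coke", "soda")   # 2 out of 4
print(f"soda => Coke: {conf_soda_coke:.0%}")
print(f"Coke => soda: {conf_coke_soda:.0%}")
```

The numerator is the same in both directions; only the denominator changes, which is why reversing a rule changes its confidence.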
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the number of records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift... When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
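To illustrate the negative rule, the sketch below computes lift for "soda => Coke" and for "soda => NOT Coke" on the five POS transactions. Applying lift to this particular rule, and expressing rules as predicates, are choices made for the example rather than something the slides spell out.

```python
transactions = [
    {"Coke", "soda"},
    {"Milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def lift(lhs, rhs):
    """Observed co-occurrences divided by the number expected under no relationship."""
    n = len(transactions)
    n_lhs = sum(lhs(t) for t in transactions)
    n_rhs = sum(rhs(t) for t in transactions)
    n_both = sum(lhs(t) and rhs(t) for t in transactions)
    return n * n_both / (n_lhs * n_rhs)


def has_soda(t):
    return "soda" in t


def has_coke(t):
    return "Coke" in t


def no_coke(t):
    return "Coke" not in t


lift_rule = lift(has_soda, has_coke)      # 5*2 / (3*4)
lift_negative = lift(has_soda, no_coke)   # 5*1 / (3*1)
print(round(lift_rule, 3), round(lift_negative, 3))
```

Here the original rule has lift below 1 (about 0.83), while the negative rule "if soda then not Coke" has lift above 1 (about 1.67).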
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  About 50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: to offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear cut, due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
3
What are Association Rules
Study of ldquowhat goes with whatrdquoStudy of ldquowhat goes with whatrdquo ldquoldquoCustomers who bought X also bought YrdquoCustomers who bought X also bought Yrdquo What symptoms go with what diagnosisWhat symptoms go with what diagnosis
Transaction-based or event-basedTransaction-based or event-based Also called ldquomarket basket analysisrdquo Also called ldquomarket basket analysisrdquo
and ldquoaffinity analysisrdquoand ldquoaffinity analysisrdquo Originated with study of customer Originated with study of customer
transactions databases to determine transactions databases to determine associations among items purchasedassociations among items purchased
4
What is Market Basket Analysis
Understanding behavior of shoppersUnderstanding behavior of shoppers What items are bought togetherWhat items are bought together
Whatrsquos in each shopping cartbasketWhatrsquos in each shopping cartbasket
Basket data consist of collection of transaction Basket data consist of collection of transaction date and items bought in a transactiondate and items bought in a transaction ItemsetItemset
How does this data differ from a transaction How does this data differ from a transaction databasedatabase PivotingPivoting
Retail organizations interested in generating Retail organizations interested in generating qualified decisions and strategy based on analysis qualified decisions and strategy based on analysis of transaction data of transaction data what to put on sale how to place merchandise on shelves for what to put on sale how to place merchandise on shelves for
maximizing profit customer segmentation based on buying maximizing profit customer segmentation based on buying patternpattern
5
Examples
Rule form LHS Rule form LHS RHSRHS IF a customer buys diapers THEN they also buy IF a customer buys diapers THEN they also buy
beerbeer diapers diapers beer beer
ldquoldquoTransactions that purchase bread and butter Transactions that purchase bread and butter also purchase milkrdquoalso purchase milkrdquo
bread bread butter butter milk milk Customers who purchase maintenance Customers who purchase maintenance
agreements are very likely to purchase large agreements are very likely to purchase large appliances appliances
When a new hardware store opens one of the When a new hardware store opens one of the most commonly sold items is toilet bowl cleanersmost commonly sold items is toilet bowl cleaners
6
Evaluation Support Support measure of how often the collection of measure of how often the collection of
items in an association occur together as a items in an association occur together as a percentage of all the transactionspercentage of all the transactions In 2 of the purchases at hardware store both pick and In 2 of the purchases at hardware store both pick and
shovel were boughtshovel were bought support = tuples(LHS RHS)Nsupport = tuples(LHS RHS)N
Confidence Confidence confidence of rule ldquoB given Ardquo is a confidence of rule ldquoB given Ardquo is a measure of how much more likely it is that B measure of how much more likely it is that B occurs when A has occurred occurs when A has occurred 100 meaning that B always occurs if A has occurred100 meaning that B always occurs if A has occurred confidence = tuples(LHS RHS) tuples(LHS)confidence = tuples(LHS RHS) tuples(LHS) Example bread and butter Example bread and butter milk [90 1] milk [90 1]
Rules originating from the same itemset have Rules originating from the same itemset have identical support but can have different identical support but can have different confidenceconfidence
7
The association rules mining problem
Generate all association rules from Generate all association rules from the given dataset that have the given dataset that have
support greater than a specified support greater than a specified minimumminimum
and and confidence greater than a specified confidence greater than a specified
minimumminimum
8
Examples
Rule form Rule form LHS LHS RHS [confidence RHS [confidence
support]support] diapers diapers beer [60 05] beer [60 05]
ldquoldquo90 of transactions that purchase 90 of transactions that purchase bread and butter also purchase milkrdquobread and butter also purchase milkrdquo
bread and butter bread and butter milk [90 1] milk [90 1]
9
Example
Tr Items
T1 Beer Milk
T2 Bread Butter
T3 Bread Butter Jelly
T4 Bread Butter Milk
T5 Beer Bread
Itemset Support
Bread 80
Butter 60
Milk 40
Beer 40
Bread Butter 60
Large Itemsets with minsup=30
Consider the itemset
Bread Butter and the two possible rules
Bread Butter
Butter Bread
Support(Bread Butter)support(Bread = 75
ie Confidence(Bread Butter) = 75
Support(Bread Butter)support(Butter = 1
ie Confidence(Butter Bread) = 100
10
How Good is an Association Rule
Are support and confidence enough?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
  Lift = P(LHS ∧ RHS) / (P(LHS) P(RHS))
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
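As a rough illustration of the lift formula, the sketch below applies it to the five-transaction bread and butter example from the earlier slide. The function name `lift` and its signature are assumptions made for this sketch.

```python
# Transactions T1..T5 from the earlier example slide.
transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]


def lift(lhs, rhs, baskets):
    """Lift = P(LHS and RHS) / (P(LHS) * P(RHS)), estimated from basket counts."""
    n = len(baskets)
    p_lhs = sum(1 for b in baskets if lhs <= b) / n
    p_rhs = sum(1 for b in baskets if rhs <= b) / n
    p_both = sum(1 for b in baskets if (lhs | rhs) <= b) / n
    return p_both / (p_lhs * p_rhs)


# Bread -> Butter: 0.6 / (0.8 * 0.6), a lift above 1.
print(round(lift({"Bread"}, {"Butter"}, transactions), 3))
# Beer -> Bread: 0.2 / (0.4 * 0.8), a lift below 1.
print(round(lift({"Beer"}, {"Bread"}, transactions), 3))
```

Lift is symmetric in LHS and RHS, so Butter → Bread has the same lift as Bread → Butter even though the two rules have different confidence.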
11
The Problem of Lots of Data
A fast food restaurant could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
A supermarket could have 10,000 or more unique items
  About 50 million 2-item combinations
  More than 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
The number of transactions in a given time period could also be huge (hence expensive to analyze)
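These counts are binomial coefficients ("n choose k"), and can be reproduced with Python's standard library. The variable names are illustrative only.

```python
from math import comb

# Number of distinct 3-item combinations on a 100-item menu.
menu_triples = comb(100, 3)

# Number of 2-item and 3-item combinations among 10,000 unique items.
supermarket_pairs = comb(10_000, 2)
supermarket_triples = comb(10_000, 3)

print(f"{menu_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```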
12
Preparing Data for MBA
Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to appropriate level
  Depends on objective of model
  Rolling up rare items to get adequate support
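A minimal sketch of the last two preparation steps, assuming point-of-sale data arrives as (transaction id, item) rows. The sample rows, the `category` hierarchy and the function name `to_itemsets` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical point-of-sale rows: (transaction id, item purchased).
rows = [
    (1, "skim milk"), (1, "rye bread"),
    (2, "whole milk"), (2, "salted butter"), (2, "rye bread"),
    (3, "skim milk"), (3, "whole milk"),
]

# Hypothetical product hierarchy used to roll items up one level.
category = {
    "skim milk": "milk",
    "whole milk": "milk",
    "rye bread": "bread",
    "salted butter": "butter",
}


def to_itemsets(rows, rollup=None):
    """Group rows by transaction id into itemsets, optionally rolling items up."""
    baskets = defaultdict(set)
    for tid, item in rows:
        baskets[tid].add(rollup.get(item, item) if rollup else item)
    return dict(baskets)


print(to_itemsets(rows))
print(to_itemsets(rows, rollup=category))
```

After roll-up, "milk" appears in all three baskets, whereas each individual milk product appears in only two, which is the sense in which generalizing rare items raises support.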
14
Search Approach
Two sub-problems in discovering all association rules
1. Find all sets of items (itemsets) that have transaction support above minimum support
  Itemsets that qualify are called large itemsets, and all others small itemsets
2. Generate from each large itemset the rules that use items from the large itemset
  Given a large itemset Y and a subset X of Y, take the support of Y and divide it by the support of X
  If the ratio c is at least minconf, then the rule X → (Y − X) is satisfied with confidence factor c
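The second sub-problem can be sketched as follows, using the five transactions from the earlier example slide. The function name `rules_from_itemset` is an assumption for this sketch. It divides basket counts rather than support fractions, which gives the same ratio c.

```python
from itertools import combinations

# Transactions T1..T5 from the earlier example slide.
transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]


def count(itemset, baskets):
    """Number of baskets containing every item of itemset."""
    return sum(1 for b in baskets if itemset <= b)


def rules_from_itemset(y, baskets, minconf):
    """For each non-empty proper subset X of Y, keep X -> (Y - X) if c >= minconf."""
    y = frozenset(y)
    rules = []
    for size in range(1, len(y)):
        for x in combinations(sorted(y), size):
            x = frozenset(x)
            c = count(y, baskets) / count(x, baskets)  # support(Y) / support(X)
            if c >= minconf:
                rules.append((set(x), set(y - x), c))
    return rules


for lhs, rhs, c in rules_from_itemset({"Bread", "Butter"}, transactions, 0.7):
    print(lhs, "->", rhs, c)
```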
15
Reducing Number of Candidates
Apriori principle: if an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
  Also, any superset of a small itemset is also small
Itemset lattice for items A, B, C, D
A B C D
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
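The pruning effect on this four-item lattice can be sketched in a few lines. The function name `pruned_by` is hypothetical. It lists the supersets that never need to be counted once a given itemset is known to be small.

```python
from itertools import combinations

items = "ABCD"

# All 15 non-empty itemsets in the lattice over A, B, C, D.
lattice = [frozenset(c) for k in range(1, len(items) + 1)
           for c in combinations(items, k)]


def pruned_by(small_itemset, lattice):
    """Proper supersets of a small itemset; all of them must be small too."""
    small = frozenset(small_itemset)
    return sorted("".join(sorted(s)) for s in lattice if small < s)


# If AB is small, then ABC, ABD and ABCD can be skipped without counting.
print(pruned_by("AB", lattice))
```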
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products:
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
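The steps above can be sketched as a level-wise search. This is a minimal illustration under stated assumptions, not an optimized Apriori implementation. The function name `apriori` and the join/prune details are choices made for this sketch, and the data are the five bread and butter transactions from the earlier example slide.

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(baskets)

    def frequent(candidates):
        # Keep only the candidates whose support meets the criterion.
        result = {}
        for c in candidates:
            s = sum(1 for b in baskets if c <= b) / n
            if s >= min_support:
                result[c] = s
        return result

    items = {item for b in baskets for item in b}
    level = frequent({frozenset([i]) for i in items})  # one-item sets
    all_frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join: unions of two frequent (k-1)-item sets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-item subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent


transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]

for itemset, s in sorted(apriori(transactions, 0.30).items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), s)
```

With a 30% minimum support, the search stops after the two-item level, because only one two-item set survives and no three-item candidate can be formed from it.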
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = # transactions with consequent, as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support, see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} → {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} → {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more, with confidence of 100%, 33%, 29% & 100%
If the confidence criterion is 70%, report only rules 2, 3 and 6
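The six rules can be enumerated from support counts alone. In the sketch below, the counts (out of 10 transactions) are the ones reported on the slides for red, white, green and their combinations. The function name `rules` and the output layout are assumptions made for illustration.

```python
from itertools import combinations

# Support counts (number of transactions out of 10) as reported on the slides.
support_count = {
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red", "white", "green"}): 2,
}


def rules(itemset, min_conf):
    """All rules lhs -> (itemset - lhs) whose confidence is at least min_conf."""
    full = frozenset(itemset)
    out = []
    for size in range(len(full) - 1, 0, -1):  # two-item antecedents first
        for lhs in combinations(sorted(full), size):
            lhs = frozenset(lhs)
            conf = support_count[full] / support_count[lhs]
            if conf >= min_conf:
                out.append((sorted(lhs), sorted(full - lhs), round(conf, 2)))
    return out


for lhs, rhs, conf in rules({"red", "white", "green"}, 0.0):
    print(lhs, "->", rhs, conf)
```

Raising `min_conf` to 0.7 leaves three rules, all with 100% confidence, which agrees with the slide's statement that only three of the six rules are reported.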
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green =>         Red, White      2           4           2               2.5
2     100      Green =>         Red             2           6           2               1.666667
3     100      Green, White =>  Red             2           6           2               1.666667
4     100      Green =>         White           2           7           2               1.428571
5     100      Green, Red =>    White           2           7           2               1.428571
6     100      Orange =>        White           2           7           2               1.428571
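The Lift Ratio column can be recomputed from the three support columns, assuming the 10 transactions mentioned on the Support slide. The function name `lift_ratio` and the row layout are illustrative.

```python
N_TRANSACTIONS = 10  # the Support slide speaks of "4 out of 10 transactions"

# (antecedent, consequent, support(a), support(c), support(a U c)) from the table.
rows = [
    ("Green", "Red, White", 2, 4, 2),
    ("Green", "Red", 2, 6, 2),
    ("Green, White", "Red", 2, 6, 2),
    ("Green", "White", 2, 7, 2),
    ("Green, Red", "White", 2, 7, 2),
    ("Orange", "White", 2, 7, 2),
]


def lift_ratio(sup_a, sup_c, sup_ac, n=N_TRANSACTIONS):
    """Lift = confidence / benchmark confidence."""
    confidence = sup_ac / sup_a   # share of antecedent transactions with consequent
    benchmark = sup_c / n         # share of all transactions with consequent
    return confidence / benchmark


for antecedent, consequent, sup_a, sup_c, sup_ac in rows:
    print(f"{antecedent} => {consequent}: {lift_ratio(sup_a, sup_c, sup_ac):.6f}")
```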
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons, etc.
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y → Z, where Y and Z are two products; Y represents the antecedent and Z is called the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z
  Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence (P(Y ∧ Z)) to the expected frequency of co-occurrence under the assumption of independence of Y and Z (P(Y) P(Z))
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
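A short derivation, using only the definitions on this slide, shows how the three measures relate. Interest is the confidence of the rule divided by the baseline probability of the consequent, which is the same quantity the earlier slides call lift.

```latex
\text{interest}(Y \rightarrow Z)
  = \frac{P(Y \wedge Z)}{P(Y)\,P(Z)}
  = \frac{P(Y \wedge Z)/P(Y)}{P(Z)}
  = \frac{P(Z \mid Y)}{P(Z)}
  = \frac{\text{confidence}(Y \rightarrow Z)}{P(Z)}
```

An interest of exactly one therefore means that knowing Y is in the basket does not change the probability of finding Z.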
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc.
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well-known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
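The co-occurrence counts can be derived directly from the five POS transactions. The sketch below is one simple way to do it. The function name `co_occurrence` and the nested-dictionary layout are assumptions for illustration.

```python
# The five POS transactions from the slide.
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
products = ["Coke", "window cleaner", "milk", "soda", "detergent"]


def co_occurrence(baskets, products):
    """matrix[a][b] = number of baskets containing both a and b.

    The diagonal matrix[a][a] is the number of baskets containing a.
    """
    return {
        a: {b: sum(1 for basket in baskets if a in basket and b in basket)
            for b in products}
        for a in products
    }


matrix = co_occurrence(baskets, products)
for a in products:
    print(f"{a:15}", [matrix[a][b] for b in products])
```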
46
How Good is an Association Rule
Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
Simple patterns
1. Coke and soda are purchased together more often than most other pairs of items (2 of the 5 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
  If a customer purchases soda, then the customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
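Both confidence values can also be read straight off the co-occurrence counts. The diagonal entry gives the number of baskets with the "if" item and the off-diagonal entry gives the number with both items. The dictionary `co` below holds only the four counts needed and is an illustrative layout, not something defined on the slides.

```python
# Counts taken from the co-occurrence matrix for Coke and soda.
co = {
    ("soda", "soda"): 3,   # baskets containing soda
    ("Coke", "Coke"): 4,   # baskets containing Coke
    ("soda", "Coke"): 2,   # baskets containing both
    ("Coke", "soda"): 2,
}


def confidence(if_item, then_item):
    """Baskets with both items divided by baskets with the 'if' item."""
    return co[(if_item, then_item)] / co[(if_item, if_item)]


print(f"soda -> Coke: {confidence('soda', 'Coke'):.0%}")
print(f"Coke -> soda: {confidence('Coke', 'soda'):.0%}")
```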
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the number of records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift
  When lift > 1, the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
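A sketch of the observed-versus-expected calculation on the five POS transactions, including the negative rule. The function name `lift` and its `negate` flag are assumptions introduced for this illustration.

```python
# The five POS transactions from the earlier slide.
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def lift(if_item, then_item, baskets, negate=False):
    """Observed count of the rule divided by the count expected under no relationship.

    With negate=True the rule is 'if if_item then NOT then_item'.
    """
    n = len(baskets)

    def result_holds(basket):
        return (then_item in basket) != negate

    with_if = [b for b in baskets if if_item in b]
    observed = sum(1 for b in with_if if result_holds(b))
    expected = len(with_if) * sum(1 for b in baskets if result_holds(b)) / n
    return observed / expected


# "If soda then Coke": 2 observed against 3 * 4 / 5 = 2.4 expected.
print(round(lift("soda", "Coke", baskets), 3))
# Negative rule "if soda then NOT Coke": 1 observed against 3 * 1 / 5 = 0.6 expected.
print(round(lift("soda", "Coke", baskets, negate=True), 3))
```

In this tiny data set the rule "if soda then Coke" has a lift below 1 despite its 67% confidence, because Coke is already in 80% of baskets, and the negative rule has a lift above 1.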
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items: "if Coke then soda"
2. Generate co-occurrence matrix for two items: "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items: "if Coke and Milk and Window Cleaner then soda"
4. Etc.
51
Final Thought on Association Rules: The Problem of Lots of Data
A fast food restaurant could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
A supermarket could have 10,000 or more unique items
  About 50 million 2-item combinations
  More than 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide a guide to action: to offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut, due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
4
What is Market Basket Analysis
Understanding behavior of shoppersUnderstanding behavior of shoppers What items are bought togetherWhat items are bought together
Whatrsquos in each shopping cartbasketWhatrsquos in each shopping cartbasket
Basket data consist of collection of transaction Basket data consist of collection of transaction date and items bought in a transactiondate and items bought in a transaction ItemsetItemset
How does this data differ from a transaction How does this data differ from a transaction databasedatabase PivotingPivoting
Retail organizations interested in generating Retail organizations interested in generating qualified decisions and strategy based on analysis qualified decisions and strategy based on analysis of transaction data of transaction data what to put on sale how to place merchandise on shelves for what to put on sale how to place merchandise on shelves for
maximizing profit customer segmentation based on buying maximizing profit customer segmentation based on buying patternpattern
5
Examples
Rule form LHS Rule form LHS RHSRHS IF a customer buys diapers THEN they also buy IF a customer buys diapers THEN they also buy
beerbeer diapers diapers beer beer
ldquoldquoTransactions that purchase bread and butter Transactions that purchase bread and butter also purchase milkrdquoalso purchase milkrdquo
bread bread butter butter milk milk Customers who purchase maintenance Customers who purchase maintenance
agreements are very likely to purchase large agreements are very likely to purchase large appliances appliances
When a new hardware store opens one of the When a new hardware store opens one of the most commonly sold items is toilet bowl cleanersmost commonly sold items is toilet bowl cleaners
6
Evaluation Support Support measure of how often the collection of measure of how often the collection of
items in an association occur together as a items in an association occur together as a percentage of all the transactionspercentage of all the transactions In 2 of the purchases at hardware store both pick and In 2 of the purchases at hardware store both pick and
shovel were boughtshovel were bought support = tuples(LHS RHS)Nsupport = tuples(LHS RHS)N
Confidence Confidence confidence of rule ldquoB given Ardquo is a confidence of rule ldquoB given Ardquo is a measure of how much more likely it is that B measure of how much more likely it is that B occurs when A has occurred occurs when A has occurred 100 meaning that B always occurs if A has occurred100 meaning that B always occurs if A has occurred confidence = tuples(LHS RHS) tuples(LHS)confidence = tuples(LHS RHS) tuples(LHS) Example bread and butter Example bread and butter milk [90 1] milk [90 1]
Rules originating from the same itemset have Rules originating from the same itemset have identical support but can have different identical support but can have different confidenceconfidence
7
The association rules mining problem
Generate all association rules from Generate all association rules from the given dataset that have the given dataset that have
support greater than a specified support greater than a specified minimumminimum
and and confidence greater than a specified confidence greater than a specified
minimumminimum
8
Examples
Rule form Rule form LHS LHS RHS [confidence RHS [confidence
support]support] diapers diapers beer [60 05] beer [60 05]
ldquoldquo90 of transactions that purchase 90 of transactions that purchase bread and butter also purchase milkrdquobread and butter also purchase milkrdquo
bread and butter bread and butter milk [90 1] milk [90 1]
9
Example
Tr Items
T1 Beer Milk
T2 Bread Butter
T3 Bread Butter Jelly
T4 Bread Butter Milk
T5 Beer Bread
Itemset Support
Bread 80
Butter 60
Milk 40
Beer 40
Bread Butter 60
Large Itemsets with minsup=30
Consider the itemset
Bread Butter and the two possible rules
Bread Butter
Butter Bread
Support(Bread Butter)support(Bread = 75
ie Confidence(Bread Butter) = 75
Support(Bread Butter)support(Butter = 1
ie Confidence(Butter Bread) = 100
10
How Good is an Association Rule
Is support and confidence enoughIs support and confidence enough Lift (improvement) tells us how much better a Lift (improvement) tells us how much better a
rule is at predicting the result than just assuming rule is at predicting the result than just assuming the result in the first placethe result in the first place Lift = P(LHS^RHS) (P(LHS)P(RHS) Lift = P(LHS^RHS) (P(LHS)P(RHS)
When lift gt 1 then the rule is better at predicting When lift gt 1 then the rule is better at predicting the result than guessingthe result than guessing
When lift lt 1 the rule is doing worse than When lift lt 1 the rule is doing worse than informed guessing and using the informed guessing and using the Negative RuleNegative Rule produces a better rule than guessingproduces a better rule than guessing
11
The Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Also the number of transactions in a Also the number of transactions in a given time-period could also be huge given time-period could also be huge (hence expensive to analyze)(hence expensive to analyze)
12
Preparing Data for MBA
Determining scope of dataset (one Determining scope of dataset (one or many stores what period etc)or many stores what period etc)
Converting transaction data to Converting transaction data to itemsetsitemsets
Generalizing items to appropriate Generalizing items to appropriate levellevel Depends on objective of modelDepends on objective of model Rolling up rare items to get adequate Rolling up rare items to get adequate
supportsupport
13
Preparing Data for MBA
Determining scope of dataset (one Determining scope of dataset (one or many stores what period etc)or many stores what period etc)
Converting transaction data to Converting transaction data to itemsetsitemsets
Generalizing items to appropriate Generalizing items to appropriate levellevel Depends on objective of modelDepends on objective of model Rolling up rare items to get adequate Rolling up rare items to get adequate
supportsupport
14
Search Approach
Two sub-problems in discovering all association Two sub-problems in discovering all association rulesrules
Find all sets of items (itemsets) that have Find all sets of items (itemsets) that have transaction support above minimum supporttransaction support above minimum support
Itemsets that qualify are called Itemsets that qualify are called largelarge itemsets itemsets and and all others all others smallsmall itemsets itemsets
Generate from each large itemset rules that Generate from each large itemset rules that use items from the large itemsetuse items from the large itemset
Given a large itemset Given a large itemset YY and and XX is a subset of is a subset of YY Take the support of Take the support of YY and divide it by the support of and divide it by the support of XX If the ratio c is at least If the ratio c is at least minconfminconf then then XX ( (YY - - XX) is ) is
satisfied with confidence factor csatisfied with confidence factor c
15
Reducing Number of Candidates
Apriori principleApriori principle If an itemset is large then all of its If an itemset is large then all of its
subsets must also be largesubsets must also be large
Support of an itemset never exceeds the Support of an itemset never exceeds the support of its subsetssupport of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes.
Exploits the property that any subset of a large itemset is also a large itemset.
Also, any superset of a small itemset is also small.
Itemset lattice for items A, B, C, D:
A B C D
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
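The pruning idea on this lattice can be illustrated with a small candidate-generation sketch. The function name `generate_candidates` and the choice of which pairs are large are assumptions made for the example, not part of the original slides.

```python
from itertools import combinations


def generate_candidates(large_k):
    """Join large k-itemsets into (k+1)-item candidates, then drop any
    candidate that has a k-item subset which is not large."""
    large_k = {frozenset(s) for s in large_k}
    if not large_k:
        return set()
    k = len(next(iter(large_k)))
    candidates = set()
    for a, b in combinations(large_k, 2):
        union = a | b
        if len(union) != k + 1:
            continue
        # A superset of a small itemset is small, so every k-subset must be large.
        if all(frozenset(sub) in large_k for sub in combinations(union, k)):
            candidates.add(union)
    return candidates


# Hypothetical result of the 2-item pass: AD and CD turned out small.
large_pairs = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"B", "D"}]
candidates = generate_candidates(large_pairs)
print([sorted(c) for c in candidates])
```

Under these assumptions only ABC needs to be counted: ABD is discarded because AD is small, and ACD and BCD are discarded because they contain a small pair (AD or CD).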
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as:
"If red, then white" ("If a red faceplate is purchased, then so is a white one")
"If white, then red"
"If red and white, then green"
+ several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items.
Problem: computation time grows exponentially as the number of items increases.
Solution: consider only "frequent item sets".
Criterion for frequent: support
23
Support
Support = number (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate the list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate the list of two-item sets that meet the support criterion
4. Use the list of two-item sets to generate the list of three-item sets
5. Continue up through k-item sets
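The steps above can be sketched as a level-by-level search. This is a minimal illustration, not the XLMiner implementation; the function name `frequent_itemsets` is hypothetical, and the data are the five transactions from the deck's bread and butter example with its minimum support of 30%.

```python
from itertools import combinations

# Five transactions (T1..T5) from the deck's bread and butter example.
TRANSACTIONS = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for every itemset with support >= minsup."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Step 2: one-item sets that meet the support criterion.
    items = sorted({item for t in transactions for item in t})
    current = {}
    for item in items:
        single = frozenset([item])
        if support(single) >= minsup:
            current[single] = support(single)

    result = {}
    k = 1
    while current:
        result.update(current)
        # Steps 3-5: build (k+1)-item candidates from the frequent k-item sets,
        # keeping only those whose k-item subsets are all frequent.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k + 1}
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k))
        }
        current = {c: support(c) for c in candidates if support(c) >= minsup}
        k += 1
    return result


frequent = frequent_itemsets(TRANSACTIONS, minsup=0.30)
for itemset, supp in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{supp:.0%}")
```

On this data the search stops after the two-item pass, since only one two-item set is frequent and no three-item candidate can be formed from it.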
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
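As a rough check of these definitions, the lift ratio can be computed directly from transaction counts. The helper name `lift_ratio` is hypothetical; the counts are those reported for the faceplate rules in this deck's XLMiner output (10 transactions in total).

```python
def lift_ratio(n_antecedent, n_consequent, n_both, n_total):
    """Lift = confidence / benchmark confidence, computed from counts."""
    confidence = n_both / n_antecedent   # share of antecedent transactions with the consequent
    benchmark = n_consequent / n_total   # share of all transactions with the consequent
    return confidence / benchmark


# Green => Red, White: support(a) = 2, support(c) = 4, support(a U c) = 2
print(round(lift_ratio(2, 4, 2, 10), 6))
# Green => Red: support(a) = 2, support(c) = 6, support(a U c) = 2
print(round(lift_ratio(2, 6, 2, 10), 6))
# Green => White: support(a) = 2, support(c) = 7, support(a U c) = 2
print(round(lift_ratio(2, 7, 2, 10), 6))
```

The three values agree with the Lift Ratio column of the XLMiner table (2.5, 1.666667 and 1.428571).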
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet the specified support & confidence:
Find frequent item sets (those with sufficient support; see above)
From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
[(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
[(support {red, white, green}) / (support {red, green})]
Plus 4 more, with confidence of 100%, 33%, 29% & 100%
If the confidence criterion is 70%, report only rules 2, 3 and 6
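The six confidences can be reproduced from support counts alone. In this sketch the counts are taken from the figures quoted in this deck (10 transactions); the function name `rules_with_confidence` is hypothetical, and the order in which it lists rules is not meant to match the slide's rule numbering.

```python
from itertools import combinations

# Support counts out of 10 transactions, as quoted in this deck.
SUPPORT_COUNT = {
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red", "white", "green"}): 2,
}


def rules_with_confidence(itemset, support_count):
    """List (antecedent, consequent, confidence) for every split of `itemset`."""
    full = frozenset(itemset)
    rules = []
    for size in range(len(full) - 1, 0, -1):
        for antecedent in combinations(sorted(full), size):
            antecedent = frozenset(antecedent)
            conf = support_count[full] / support_count[antecedent]
            rules.append((antecedent, full - antecedent, conf))
    return rules


all_rules = rules_with_confidence({"red", "white", "green"}, SUPPORT_COUNT)
kept = [rule for rule in all_rules if rule[2] >= 0.70]  # 70% confidence criterion

for antecedent, consequent, conf in all_rules:
    print(f"{sorted(antecedent)} => {sorted(consequent)}  confidence = {conf:.0%}")
print(len(kept), "rules meet the 70% criterion")
```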
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)  Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green           => Red, White   2           4           2               2.5
2     100      Green           => Red          2           6           2               1.666667
3     100      Green, White    => Red          2           6           2               1.666667
4     100      Green           => White        2           7           2               1.428571
5     100      Green, Red      => White        2           7           2               1.428571
6     100      Orange          => White        2           7           2               1.428571
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
Customers
Orders (basic purchase data)
Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
Identify who customers are (not by name)
Understand why they make certain purchases
Gain insight about its merchandise (products):
Fast and slow movers
Products which are purchased together
Products which might benefit from promotion
Take action:
Store layouts
Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
AR represent patterns in the data without a specified target variable
Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z
Hence confidence is a conditional probability, i.e. P(Z|Y)
confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence (P(Y ∧ Z)) to the expected frequency of co-occurrence under the assumption of independence of Y and Z (P(Y) P(Z))
interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
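Read against the "Measures of Performance" slide, the interest measure appears to be the same quantity as the lift ratio. A short derivation, using only the definitions on this slide, makes the link explicit:

```latex
\begin{align*}
\text{interest}(Y \Rightarrow Z)
  &= \frac{P(Y \wedge Z)}{P(Y)\,P(Z)} \\
  &= \frac{P(Y \wedge Z) / P(Y)}{P(Z)} \\
  &= \frac{\text{confidence}(Y \Rightarrow Z)}{P(Z)}
\end{align*}
```

Here P(Z) plays the role of the benchmark confidence, so an interest of at least one corresponds to a lift of at least one.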
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.):
Purchases made using credit/debit cards
Optional Telco Service purchases
Banking services
Unusual combinations of insurance claims can be a warning of fraud
Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered:
Avg # of orders/customer
Avg # of unique items/order
Avg # of items/order
For a product:
What % of customers have purchased it
Avg # of orders/customer that include it
Avg quantity of it purchased/order
Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as Shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types:
Actionable Rules: contain high-quality, actionable information
Trivial Rules: information already well-known by those familiar with the business
Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
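The co-occurrence counts can be derived mechanically from the five POS transactions. A minimal sketch, with the hypothetical helper name `co_occurrence` and the item spellings chosen for the example:

```python
# The five POS transactions from this slide.
TRANSACTIONS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
ITEMS = ["Coke", "window cleaner", "milk", "soda", "detergent"]


def co_occurrence(transactions, items):
    """matrix[a][b] = number of transactions containing both a and b.
    The diagonal matrix[a][a] is simply the number of transactions with a."""
    return {
        a: {b: sum(a in t and b in t for t in transactions) for b in items}
        for a in items
    }


matrix = co_occurrence(TRANSACTIONS, ITEMS)
for a in ITEMS:
    print(f"{a:15}", *(matrix[a][b] for b in ITEMS))
```

Each printed row should correspond to one row of the co-occurrence table, in the same item order.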
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
Simple patterns:
1. Coke and soda are purchased together more often than most other pairs (2 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule: "If a customer purchases soda, then customer also purchases Coke"?
2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
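The two confidence figures follow directly from the definition. A small sketch, where the function name `confidence` is a hypothetical helper:

```python
# The five POS transactions from this slide.
TRANSACTIONS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def confidence(antecedent, consequent, transactions):
    """Transactions with all the items divided by transactions with the "if" items."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_if = [t for t in transactions if antecedent <= t]
    with_all = [t for t in with_if if consequent <= t]
    return len(with_all) / len(with_if)


print(f"soda => Coke: {confidence({'soda'}, {'Coke'}, TRANSACTIONS):.0%}")
print(f"Coke => soda: {confidence({'Coke'}, {'soda'}, TRANSACTIONS):.0%}")
```

The asymmetry comes only from the denominator: 3 baskets contain soda, while 4 contain Coke.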
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift... When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
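To make the Negative Rule concrete, the sketch below computes lift for "soda => Coke" and for its negation "soda => NOT Coke" on the deck's five POS transactions. The helper names (`lift`, `buys_soda`, `buys_coke`, `no_coke`) are assumptions made for this example.

```python
# The five POS transactions used in the "How Good is an Association Rule" slides.
TRANSACTIONS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def lift(has_if, has_then, transactions):
    """Observed share of baskets matching both conditions, divided by the
    share expected if the two conditions were unrelated."""
    n = len(transactions)
    p_if = sum(has_if(t) for t in transactions) / n
    p_then = sum(has_then(t) for t in transactions) / n
    p_both = sum(has_if(t) and has_then(t) for t in transactions) / n
    return p_both / (p_if * p_then)


def buys_soda(basket):
    return "soda" in basket


def buys_coke(basket):
    return "Coke" in basket


def no_coke(basket):
    return "Coke" not in basket


rule_lift = lift(buys_soda, buys_coke, TRANSACTIONS)
negative_lift = lift(buys_soda, no_coke, TRANSACTIONS)
print(f"soda => Coke      lift = {rule_lift:.2f}")
print(f"soda => NOT Coke  lift = {negative_lift:.2f}")
```

On this tiny data set the positive rule has lift below 1 (Coke is in 4 of 5 baskets anyway), while the negative rule has lift above 1, which is the situation the slide describes.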
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
50 million 2-item combinations
Over 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time-period could also be huge (hence expensive to analyze)
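These counts are binomial coefficients and can be checked with Python's standard library (`math.comb`, available from Python 3.8):

```python
from math import comb

menu_triples = comb(100, 3)            # 3 different items out of a 100-item menu
supermarket_pairs = comb(10_000, 2)    # 2-item combinations out of 10,000 items
supermarket_triples = comb(10_000, 3)  # 3-item combinations out of 10,000 items

print(f"{menu_triples:,}")         # 161,700
print(f"{supermarket_pairs:,}")    # 49,995,000
print(f"{supermarket_triples:,}")  # 166,616,670,000
```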
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well defined and intelligible information of the form account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide a guide to action: to offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
5
Examples
Rule form: LHS => RHS
IF a customer buys diapers, THEN they also buy beer
diapers => beer
"Transactions that purchase bread and butter also purchase milk"
bread and butter => milk
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
6
Evaluation
Support: measure of how often the collection of items in an association occur together, as a percentage of all the transactions
In 2% of the purchases at a hardware store, both pick and shovel were bought
support = # tuples(LHS, RHS) / N
Confidence: confidence of rule "B given A" is a measure of how likely it is that B occurs when A has occurred
100% meaning that B always occurs if A has occurred
confidence = # tuples(LHS, RHS) / # tuples(LHS)
Example: bread and butter => milk [90%, 1%]
Rules originating from the same itemset have identical support but can have different confidence
7
The association rules mining problem
Generate all association rules from the given dataset that have
support greater than a specified minimum
and
confidence greater than a specified minimum
8
Examples
Rule form: LHS => RHS [confidence, support]
diapers => beer [60%, 0.5%]
"90% of transactions that purchase bread and butter also purchase milk"
bread and butter => milk [90%, 1%]
9
Example
Tr  Items
T1  Beer, Milk
T2  Bread, Butter
T3  Bread, Butter, Jelly
T4  Bread, Butter, Milk
T5  Beer, Bread

Large itemsets with minsup = 30%
Itemset        Support
Bread          80%
Butter         60%
Milk           40%
Beer           40%
Bread, Butter  60%

Consider the itemset {Bread, Butter} and the two possible rules:
Bread => Butter
Butter => Bread
Support(Bread, Butter) / Support(Bread) = 60% / 80% = 75%, i.e. Confidence(Bread => Butter) = 75%
Support(Bread, Butter) / Support(Butter) = 60% / 60% = 1, i.e. Confidence(Butter => Bread) = 100%
10
How Good is an Association Rule
Are support and confidence enough?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift = P(LHS ^ RHS) / (P(LHS) P(RHS))
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
11
The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
50 million 2-item combinations
Over 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Also, the number of transactions in a given time-period could be huge (hence expensive to analyze)
12
Preparing Data for MBA
Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to an appropriate level
Depends on the objective of the model
Rolling up rare items to get adequate support
14
Search Approach
Two sub-problems in discovering all association Two sub-problems in discovering all association rulesrules
Find all sets of items (itemsets) that have Find all sets of items (itemsets) that have transaction support above minimum supporttransaction support above minimum support
Itemsets that qualify are called Itemsets that qualify are called largelarge itemsets itemsets and and all others all others smallsmall itemsets itemsets
Generate from each large itemset rules that Generate from each large itemset rules that use items from the large itemsetuse items from the large itemset
Given a large itemset Given a large itemset YY and and XX is a subset of is a subset of YY Take the support of Take the support of YY and divide it by the support of and divide it by the support of XX If the ratio c is at least If the ratio c is at least minconfminconf then then XX ( (YY - - XX) is ) is
satisfied with confidence factor csatisfied with confidence factor c
15
Reducing Number of Candidates
Apriori principleApriori principle If an itemset is large then all of its If an itemset is large then all of its
subsets must also be largesubsets must also be large
Support of an itemset never exceeds the Support of an itemset never exceeds the support of its subsetssupport of its subsets
16
The Apriori Algorithm
Progressively Progressively identifies large identifies large itemsets of itemsets of different sizesdifferent sizes
Exploits the Exploits the property that any property that any subset of a large subset of a large itemset is also a itemset is also a large itemsetlarge itemset Also any superset Also any superset
of a small itemset of a small itemset is also smallis also small
A C DB
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
ldquoldquoIFrdquo part = IFrdquo part = antecedentantecedent
ldquoldquoTHENrdquo part = THENrdquo part = consequentconsequent
ldquoldquoItem setrdquo = the items (eg products) Item setrdquo = the items (eg products) comprising the antecedent or consequentcomprising the antecedent or consequent
Antecedent and consequent are Antecedent and consequent are disjointdisjoint (ie have no items in common)(ie have no items in common)
20
Tiny Example Phone Faceplates
21
Many Rules are Possible
For example Transaction 1 supports For example Transaction 1 supports several rules such as several rules such as ldquoldquoIf red then whiterdquo (ldquoIf a red faceplate If red then whiterdquo (ldquoIf a red faceplate
is purchased then so is a white onerdquo)is purchased then so is a white onerdquo) ldquoldquoIf white then redrdquoIf white then redrdquo ldquoldquoIf red and white then greenrdquoIf red and white then greenrdquo + several more+ several more
22
Frequent Item Sets
Ideally we want to create all possible Ideally we want to create all possible combinations of itemscombinations of items
ProblemProblem computation time grows computation time grows exponentially as items increasesexponentially as items increases
SolutionSolution consider only ldquofrequent item consider only ldquofrequent item setsrdquosetsrdquo
Criterion for frequent Criterion for frequent supportsupport
23
Support
SupportSupport = (or percent) of = (or percent) of transactions that include both the transactions that include both the antecedent and the consequentantecedent and the consequent
Example support for the item set Example support for the item set red white is 4 out of 10 red white is 4 out of 10 transactions or 40transactions or 40
24
Apriori Algorithm
25
Generating Frequent Item Sets
For For kk productshellip productshellip
11 User sets a minimum support criterionUser sets a minimum support criterion
22 Next generate list of one-item sets that Next generate list of one-item sets that meet the support criterionmeet the support criterion
33 Use the list of one-item sets to generate Use the list of one-item sets to generate list of two-item sets that meet the list of two-item sets that meet the support criterionsupport criterion
44 Use list of two-item sets to generate list Use list of two-item sets to generate list of three-item setsof three-item sets
55 Continue up through Continue up through kk-item sets-item sets
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
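A small simulation can illustrate the point. The sketch below is an assumption-laden toy: baskets are purely random (each of 10 hypothetical items is included independently with probability 0.5), so the true lift of every pair is 1. It reports how far the observed lift strays from 1 for a small and a large number of records.

```python
import random
from itertools import combinations

def max_pair_lift_deviation(n_transactions, n_items=10, p=0.5, seed=0):
    """Largest |lift - 1| over all item pairs in purely random baskets."""
    rng = random.Random(seed)
    baskets = [
        {i for i in range(n_items) if rng.random() < p}
        for _ in range(n_transactions)
    ]
    count = [sum(i in b for b in baskets) for i in range(n_items)]
    worst = 0.0
    for i, j in combinations(range(n_items), 2):
        if count[i] == 0 or count[j] == 0:
            continue  # lift is undefined when an item never occurs
        both = sum(i in b and j in b for b in baskets)
        lift = (both / n_transactions) / (
            (count[i] / n_transactions) * (count[j] / n_transactions)
        )
        worst = max(worst, abs(lift - 1.0))
    return worst

small = max_pair_lift_deviation(n_transactions=20)
large = max_pair_lift_deviation(n_transactions=5000)
print(f"20 records:   max |lift - 1| = {small:.3f}")
print(f"5000 records: max |lift - 1| = {large:.3f}")
```

With few records, some pair typically shows a lift well away from 1 even though no real association exists; with thousands of records the spurious lift shrinks toward zero.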
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail - each customer purchases different set of products, different quantities, different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products; Y is the antecedent and Z is the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e., P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y)P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
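The three measures are linked. Dividing the interest ratio through by P(Y) shows that interest is the confidence of the rule relative to the baseline frequency of Z, which is the same ratio other slides in this deck call lift:

```latex
\text{interest}(Y \Rightarrow Z)
  = \frac{P(Y \wedge Z)}{P(Y)\,P(Z)}
  = \frac{P(Z \mid Y)}{P(Z)}
  = \frac{\text{confidence}(Y \Rightarrow Z)}{P(Z)}
```

Because the first ratio is symmetric in Y and Z, the rules Y => Z and Z => Y always have the same interest, even when their confidences differ.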
38
Association Rules Apply Elsewhere
Besides retail - supermarkets, etc...
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as Shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules - contain high-quality actionable information
  Trivial Rules - information already well-known by those familiar with the business
  Inexplicable Rules - no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
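The co-occurrence counts can be derived directly from the five POS transactions. The following is a minimal sketch; the function name `co_occurrence` is an illustrative choice.

```python
from itertools import combinations_with_replacement

# The five POS transactions from the slide.
transactions = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]
items = ["Coke", "Window cleaner", "Milk", "Soda", "Detergent"]

def co_occurrence(transactions, items):
    """counts[a][b] = number of baskets containing both a and b."""
    counts = {a: {b: 0 for b in items} for a in items}
    for basket in transactions:
        # Pairs (a, a) fill the diagonal; pairs (a, b) fill both triangles.
        for a, b in combinations_with_replacement(sorted(basket), 2):
            counts[a][b] += 1
            if a != b:
                counts[b][a] += 1
    return counts

matrix = co_occurrence(transactions, items)
for a in items:
    print(f"{a:<15}" + "".join(f"{matrix[a][b]:>3}" for b in items))
```

The diagonal holds the number of baskets containing each single item, which is what the confidence calculation needs as its denominator.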
46
How Good is an Association Rule
Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns
1. Coke and soda are purchased together more often than most other pairs (2 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule:
  If a customer purchases soda, then customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = Ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
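The two confidence values can be recomputed from the POS transactions. This is a minimal sketch; `confidence` is an illustrative helper, not a library function.

```python
# The five POS transactions from the slide.
transactions = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]

def confidence(antecedent, consequent, transactions):
    """Share of baskets with the "if" items that also hold the "then" items."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_if = [t for t in transactions if antecedent <= t]
    if not with_if:
        raise ValueError("antecedent never occurs; confidence is undefined")
    with_all = [t for t in with_if if consequent <= t]
    return len(with_all) / len(with_if)

soda_to_coke = confidence({"Soda"}, {"Coke"}, transactions)
coke_to_soda = confidence({"Coke"}, {"Soda"}, transactions)
print(f"soda => Coke: {soda_to_coke:.0%}")
print(f"Coke => soda: {coke_to_soda:.0%}")
```

Reversing a rule keeps the numerator (baskets with both items) but changes the denominator, which is why the two directions differ.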
48
How Good is an Association Rule
How much better than chance is a rule?
  Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
  Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
  Calculating lift... When lift > 1, then the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
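As a worked sketch using the five Coke/soda POS transactions from the earlier slides: the rule "if soda then Coke" turns out to have lift below 1, and its negative rule "if soda then NOT Coke" has lift above 1. The `negate_consequent` flag is an illustrative device for expressing the Negative Rule, not standard terminology.

```python
# The five POS transactions from the earlier slides.
transactions = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]

def lift(antecedent, consequent, transactions, negate_consequent=False):
    """Observed co-occurrence divided by the co-occurrence expected by chance."""
    n = len(transactions)
    has_if = [antecedent <= t for t in transactions]
    if negate_consequent:
        has_then = [not (consequent <= t) for t in transactions]
    else:
        has_then = [consequent <= t for t in transactions]
    p_if = sum(has_if) / n
    p_then = sum(has_then) / n
    p_both = sum(a and b for a, b in zip(has_if, has_then)) / n
    return p_both / (p_if * p_then)

rule_lift = lift({"Soda"}, {"Coke"}, transactions)
negative_lift = lift({"Soda"}, {"Coke"}, transactions, negate_consequent=True)
print(f"soda => Coke:     lift = {rule_lift:.3f}")
print(f"soda => NOT Coke: lift = {negative_lift:.3f}")
```

Coke is in 4 of 5 baskets overall but in only 2 of 3 soda baskets, so knowing that a basket contains soda makes Coke slightly less likely than the baseline.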
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time-period could also be huge (hence expensive to analyze)
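The combination counts follow from "n choose k". A quick check with Python's standard library (the variable names are illustrative):

```python
from math import comb

fast_food_triples = comb(100, 3)        # 3-item combinations from a 100-item menu
supermarket_pairs = comb(10_000, 2)     # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)   # 3-item combinations from 10,000 items

print(f"{fast_food_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```

The exact 3-item count for 10,000 items is roughly 167 billion, so the slide's "100 billion" is best read as an order-of-magnitude figure.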
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide guide to action: to offer product or service (cross-sell)
61
In retailing case of items purchased together, guidance is not so clear cut due to extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
6
Evaluation
Support: measure of how often the collection of items in an association occur together, as a percentage of all the transactions
  In 2% of the purchases at hardware store, both pick and shovel were bought
  support = # tuples(LHS, RHS) / N
Confidence: confidence of rule "B given A" is a measure of how likely it is that B occurs when A has occurred
  100% meaning that B always occurs if A has occurred
  confidence = # tuples(LHS, RHS) / # tuples(LHS)
  Example: bread and butter => milk [90%, 1%]
Rules originating from the same itemset have identical support but can have different confidence
7
The association rules mining problem
Generate all association rules from the given dataset that have
  support greater than a specified minimum
  and
  confidence greater than a specified minimum
8
Examples
Rule form: LHS => RHS [confidence, support]
  diapers => beer [60%, 0.5%]
"90% of transactions that purchase bread and butter also purchase milk"
  bread and butter => milk [90%, 1%]
9
Example
Tr   Items
T1   Beer, Milk
T2   Bread, Butter
T3   Bread, Butter, Jelly
T4   Bread, Butter, Milk
T5   Beer, Bread

Large Itemsets with minsup = 30%
Itemset        Support
Bread          80%
Butter         60%
Milk           40%
Beer           40%
Bread, Butter  60%

Consider the itemset {Bread, Butter} and the two possible rules
  Bread => Butter
  Butter => Bread
Support(Bread, Butter) / Support(Bread) = 60% / 80% = 0.75, i.e., Confidence(Bread => Butter) = 75%
Support(Bread, Butter) / Support(Butter) = 60% / 60% = 1, i.e., Confidence(Butter => Bread) = 100%
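The support and confidence figures on this slide can be recomputed from transactions T1 to T5. This is a minimal sketch with illustrative helper names.

```python
# Transactions T1-T5 from the slide.
transactions = {
    "T1": {"Beer", "Milk"},
    "T2": {"Bread", "Butter"},
    "T3": {"Bread", "Butter", "Jelly"},
    "T4": {"Bread", "Butter", "Milk"},
    "T5": {"Beer", "Bread"},
}

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(itemset <= items for items in transactions.values())
    return hits / len(transactions)

def confidence(lhs, rhs, transactions):
    """support(LHS and RHS together) / support(LHS)."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

print(f"support(Bread)          = {support({'Bread'}, transactions):.0%}")
print(f"support(Bread, Butter)  = {support({'Bread', 'Butter'}, transactions):.0%}")
print(f"conf(Bread => Butter)   = {confidence({'Bread'}, {'Butter'}, transactions):.0%}")
print(f"conf(Butter => Bread)   = {confidence({'Butter'}, {'Bread'}, transactions):.0%}")
```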
10
How Good is an Association Rule
Are support and confidence enough?
  Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
    Lift = P(LHS ^ RHS) / (P(LHS) P(RHS))
  When lift > 1, then the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
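As a worked instance of the lift formula, take the rule Bread => Butter from the five-transaction example earlier in the deck, where support(Bread) = 80%, support(Butter) = 60% and support(Bread, Butter) = 60%:

```latex
\text{Lift}(\text{Bread} \Rightarrow \text{Butter})
  = \frac{P(\text{Bread} \wedge \text{Butter})}{P(\text{Bread})\,P(\text{Butter})}
  = \frac{0.60}{0.80 \times 0.60}
  = 1.25
```

The lift exceeds 1, so butter turns up in bread baskets more often than its overall 60% frequency would suggest.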
11
The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Also, the number of transactions in a given time-period could be huge (hence expensive to analyze)
12
Preparing Data for MBA
Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to appropriate level
  Depends on objective of model
  Rolling up rare items to get adequate support
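Rolling up rare items might look like the sketch below. Everything in it is hypothetical: the product names, the `HIERARCHY` mapping and the `min_count` threshold are made up for illustration and do not come from the slides.

```python
from collections import Counter

# Hypothetical product hierarchy: SKU-level item -> broader category.
HIERARCHY = {
    "Coke 12oz can": "Soft drinks",
    "Coke 2L bottle": "Soft drinks",
    "Store-brand cola": "Soft drinks",
    "2% milk 1gal": "Milk",
    "Skim milk 1qt": "Milk",
}

def roll_up_rare_items(transactions, hierarchy, min_count):
    """Replace items seen in fewer than min_count baskets by their category."""
    counts = Counter(item for basket in transactions for item in set(basket))
    rolled = []
    for basket in transactions:
        rolled.append({
            hierarchy.get(item, item) if counts[item] < min_count else item
            for item in basket
        })
    return rolled

# Hypothetical baskets at SKU level.
baskets = [
    {"Coke 12oz can", "2% milk 1gal"},
    {"Coke 2L bottle", "2% milk 1gal"},
    {"Store-brand cola", "Skim milk 1qt"},
]
itemsets = roll_up_rare_items(baskets, HIERARCHY, min_count=2)
for original, rolled in zip(baskets, itemsets):
    print(sorted(original), "->", sorted(rolled))
```

Here the frequent item keeps its SKU-level name while rare ones are generalized, so the resulting itemsets mix levels of the hierarchy. Whether that is acceptable depends on the objective of the model.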
14
Search Approach
Two sub-problems in discovering all association rules
  Find all sets of items (itemsets) that have transaction support above minimum support
    Itemsets that qualify are called large itemsets, and all others small itemsets
  Generate from each large itemset rules that use items from the large itemset
    Given a large itemset Y, and X is a subset of Y
    Take the support of Y and divide it by the support of X
    If the ratio c is at least minconf, then X => (Y - X) is satisfied with confidence factor c
15
Reducing Number of Candidates
Apriori principle
  If an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
  Also, any superset of a small itemset is also small
Itemset lattice:
  A  B  C  D
  AB  AC  AD  BC  BD  CD
  ABC  ABD  ACD  BCD
  ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as # items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
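The steps above can be sketched as a level-wise search. This is a simplified illustration of the Apriori idea, not the exact procedure of XLMiner or any particular package. It is run here on the five Beer/Bread/Butter transactions from the earlier example, with the 30% minimum support used there.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent item sets."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= b for b in baskets) / n

    # Step 2: one-item sets that meet the support criterion.
    items = sorted({item for b in baskets for item in b})
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Steps 3-5: build k-item candidates from frequent (k-1)-item sets.
        prev = list(current)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        current = {}
        for cand in candidates:
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent for sub in combinations(cand, k - 1)):
                s = support(cand)
                if s >= min_support:
                    current[cand] = s
        frequent.update(current)
        k += 1
    return frequent

transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]
frequent = apriori(transactions, min_support=0.30)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{s:.0%}")
```

The search stops as soon as a level yields no frequent item sets, since no larger set can be frequent either.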
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered:
Avg. number of orders per customer
Avg. number of unique items per order
Avg. number of items per order
For a product:
  What % of customers have purchased it
  Avg. number of orders per customer that include it
  Avg. quantity of it purchased per order
  Etc.
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did the purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types:
Actionable Rules: contain high-quality, actionable information
Trivial Rules: information already well known by those familiar with the business
Inexplicable Rules: have no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions

Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda

Co-occurrence of Products

                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2
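The co-occurrence counts can be reproduced from the five POS transactions with a few lines of Python. This is only a sketch: the function name `cooccurrence` and the use of a `Counter` keyed by item pairs are illustrative choices, not part of the original slides.

```python
from collections import Counter
from itertools import combinations_with_replacement

# The five POS transactions from the slide, one set of items per customer.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def cooccurrence(baskets):
    """Count, for every pair of items (a, b), the baskets containing both.

    The diagonal entry (a, a) is simply the number of baskets containing a.
    """
    counts = Counter()
    for basket in baskets:
        for a, b in combinations_with_replacement(sorted(basket), 2):
            counts[(a, b)] += 1
            if a != b:
                counts[(b, a)] += 1  # keep the matrix symmetric
    return counts

counts = cooccurrence(transactions)
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]
for row in items:
    print(f"{row:15}", [counts[(row, col)] for col in items])
```

Pairs that never occur together are absent from the `Counter` and read back as 0, which matches the zero cells of the matrix.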
46
How Good is an Association Rule
                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2

Simple patterns:
1. Coke and soda are purchased together more often (2 of 5 orders) than most other pairs of items; only Coke and detergent tie with them
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule: "If a customer purchases soda, then the customer also purchases Coke"?
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions

Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda
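The two confidence figures can be checked directly against the transaction list. The helper below is a minimal sketch; the name `confidence` and its argument layout are assumptions made for illustration.

```python
# The five POS transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def confidence(baskets, if_items, then_items):
    """Share of the baskets containing the "if" items that also contain the "then" items."""
    if_items, then_items = set(if_items), set(then_items)
    with_if = [b for b in baskets if if_items <= b]
    if not with_if:
        raise ValueError("no basket contains the 'if' items")
    with_all = [b for b in with_if if then_items <= b]
    return len(with_all) / len(with_if)

soda_to_coke = confidence(transactions, {"soda"}, {"Coke"})   # 2 of 3 baskets
coke_to_soda = confidence(transactions, {"Coke"}, {"soda"})   # 2 of 4 baskets
print(f"soda => Coke: {soda_to_coke:.0%}, Coke => soda: {coke_to_soda:.0%}")
```

The asymmetry between the two directions is the point of the slide: the same pair of items gives different confidence depending on which item is the "if" part.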
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift:
  When lift > 1, the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
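As a worked illustration of the lift definition, the sketch below applies it to the five Coke/soda POS transactions used earlier in this deck. The function name `lift` is an assumption; the formula is the ratio of observed co-occurrence to the co-occurrence expected if the two sides were unrelated.

```python
# The five POS transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def lift(baskets, if_items, then_items):
    """Observed share of baskets with both sides / share expected with no relationship."""
    if_items, then_items = set(if_items), set(then_items)
    n = len(baskets)
    p_if = sum(1 for b in baskets if if_items <= b) / n
    p_then = sum(1 for b in baskets if then_items <= b) / n
    p_both = sum(1 for b in baskets if (if_items | then_items) <= b) / n
    return p_both / (p_if * p_then)

soda_coke = lift(transactions, {"soda"}, {"Coke"})            # below 1
coke_detergent = lift(transactions, {"Coke"}, {"detergent"})  # above 1
print(round(soda_coke, 3), round(coke_detergent, 3))
```

On this tiny data set "soda => Coke" has a lift below 1 even though its confidence is 67%, because Coke appears in 80% of all baskets anyway; "Coke => detergent" has a lift above 1.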
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items: "if Coke then soda"
2. Generate co-occurrence matrix for two items: "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items: "if Coke and Milk and Window Cleaner then soda"
4. Etc.
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant: could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket: 10,000 or more unique items
  About 50 million 2-item combinations
  Over 100 billion 3-item combinations (about 167 billion for 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
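The combination counts follow from the binomial coefficient "n choose k". A quick check with Python's standard library (the variable names are illustrative only):

```python
from math import comb

# Number of distinct k-item combinations that can be drawn from n unique items.
fast_food_triples = comb(100, 3)        # 100 menu items, 3 at a time
supermarket_pairs = comb(10_000, 2)     # 10,000 items, 2 at a time
supermarket_triples = comb(10_000, 3)   # 10,000 items, 3 at a time

print(f"{fast_food_triples:,}")     # 161,700
print(f"{supermarket_pairs:,}")     # 49,995,000
print(f"{supermarket_triples:,}")   # 166,616,670,000
```

The pair count is just under 50 million and the triple count is roughly 167 billion, so the slide's figures are order-of-magnitude estimates.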
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: which product or service to offer (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
7
The association rules mining problem
Generate all association rules from the given dataset that have
  support greater than a specified minimum
  and confidence greater than a specified minimum
8
Examples
Rule form: LHS => RHS [confidence, support]
diapers => beer [60%, 0.5%]
"90% of transactions that purchase bread and butter also purchase milk"
bread and butter => milk [90%, 1%]
9
Example
Tr   Items
T1   Beer, Milk
T2   Bread, Butter
T3   Bread, Butter, Jelly
T4   Bread, Butter, Milk
T5   Beer, Bread

Large itemsets with minsup = 30%:

Itemset         Support
Bread           80%
Butter          60%
Milk            40%
Beer            40%
Bread, Butter   60%

Consider the itemset {Bread, Butter} and the two possible rules
Bread => Butter
Butter => Bread
Support(Bread, Butter) / Support(Bread) = 60% / 80% = 0.75, i.e. Confidence(Bread => Butter) = 75%
Support(Bread, Butter) / Support(Butter) = 60% / 60% = 1, i.e. Confidence(Butter => Bread) = 100%
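The support table and the two confidence values can be reproduced by brute force on the five transactions T1 to T5. The sketch below simply enumerates every possible itemset, which is only feasible because the example is tiny; the variable names are illustrative.

```python
from itertools import combinations

# Transactions T1..T5 from the example.
transactions = {
    "T1": {"Beer", "Milk"},
    "T2": {"Bread", "Butter"},
    "T3": {"Bread", "Butter", "Jelly"},
    "T4": {"Bread", "Butter", "Milk"},
    "T5": {"Beer", "Bread"},
}
n = len(transactions)
minsup = 0.30

items = sorted(set().union(*transactions.values()))

# Support count of every itemset that meets the minimum support.
large = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        count = sum(1 for t in transactions.values() if set(combo) <= t)
        if count / n >= minsup:
            large[combo] = count

for itemset, count in large.items():
    print(itemset, f"{count / n:.0%}")

# Confidence = support of the whole itemset / support of the left-hand side.
conf_bread_butter = large[("Bread", "Butter")] / large[("Bread",)]
conf_butter_bread = large[("Bread", "Butter")] / large[("Butter",)]
print(conf_bread_butter, conf_butter_bread)
```

Working with counts rather than percentages avoids floating-point rounding in the confidence ratios.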
10
How Good is an Association Rule
Are support and confidence enough?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift = P(LHS ^ RHS) / (P(LHS) P(RHS))
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
11
The Problem of Lots of Data
Fast food restaurant: could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket: 10,000 or more unique items
  About 50 million 2-item combinations
  Over 100 billion 3-item combinations (about 167 billion for 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Also, the number of transactions in a given time period could be huge (hence expensive to analyze)
12
Preparing Data for MBA
Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to appropriate level
  Depends on objective of model
  Rolling up rare items to get adequate support
14
Search Approach
Two sub-problems in discovering all association rules:
Find all sets of items (itemsets) that have transaction support above minimum support
  Itemsets that qualify are called large itemsets, and all others small itemsets
Generate from each large itemset rules that use items from the large itemset
  Given a large itemset Y, and X a subset of Y:
  Take the support of Y and divide it by the support of X
  If the ratio c is at least minconf, then the rule X => (Y - X) is satisfied with confidence factor c
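The second sub-problem can be sketched as follows: for a large itemset Y, try every non-empty proper subset X as the left-hand side and keep the rule X => (Y - X) when support(Y) / support(X) reaches minconf. The function name and the dictionary of support counts are assumptions for illustration; the counts are taken from the Bread/Butter example earlier in the deck.

```python
from itertools import combinations

def rules_from_itemset(itemset, support, minconf):
    """Return (X, Y - X, c) for every proper non-empty subset X of Y with c >= minconf.

    `support` maps frozensets to support counts and must contain Y and all its subsets.
    """
    y = frozenset(itemset)
    rules = []
    for size in range(1, len(y)):
        for lhs in combinations(sorted(y), size):
            x = frozenset(lhs)
            c = support[y] / support[x]  # confidence factor of X => (Y - X)
            if c >= minconf:
                rules.append((x, y - x, c))
    return rules

# Support counts (out of 5 transactions) from the Bread/Butter example.
support = {
    frozenset({"Bread"}): 4,
    frozenset({"Butter"}): 3,
    frozenset({"Bread", "Butter"}): 3,
}

rules = rules_from_itemset({"Bread", "Butter"}, support, minconf=0.8)
for x, rest, c in rules:
    print(sorted(x), "=>", sorted(rest), f"confidence {c:.0%}")
```

With minconf set to 80%, only Butter => Bread survives; Bread => Butter has confidence 75% and is dropped.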
15
Reducing Number of Candidates
Apriori principle: if an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
Also, any superset of a small itemset is also small

Itemset lattice for items A, B, C, D:
A   B   C   D
AB   AC   AD   BC   BD   CD
ABC   ABD   ACD   BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = number (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products:
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
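The level-by-level procedure above can be sketched in a few lines of Python. This is a simplified, unoptimized illustration rather than a reference Apriori implementation: candidates of size k are built by joining frequent sets of size k-1, and a candidate is kept only if all of its (k-1)-item subsets are frequent. The faceplate transactions are not reproduced in this transcript, so the sketch reuses the five Beer/Bread/Butter transactions from the earlier example; the function name `apriori` is an assumption.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support} for every item set meeting min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Step 2: one-item sets that meet the support criterion.
    items = sorted({item for t in transactions for item in t})
    level = {frozenset([i]): support(frozenset([i])) for i in items}
    level = {s: v for s, v in level.items() if v >= min_support}

    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Steps 3-5: join frequent (k-1)-item sets into k-item candidates...
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # ...and prune candidates that have an infrequent (k-1)-item subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {c: support(c) for c in candidates}
        level = {c: v for c, v in level.items() if v >= min_support}
    return frequent

transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]
frequent = apriori(transactions, min_support=0.30)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{s:.0%}")
```

The pruning step is what keeps the search manageable: Jelly is dropped at the one-item level, so no larger set containing Jelly is ever counted.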
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
Find frequent item sets (those with sufficient support, see above)
From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more, with confidence of 100%, 33%, 29% & 100%
If the confidence criterion is 70%, report only rules 2, 3 and 6
30
All Rules (XLMiner Output)
Rule #   Conf. %   Antecedent (a)      Consequent (c)   Support(a)   Support(c)   Support(a U c)   Lift Ratio
1        100       Green          =>   Red, White       2            4            2                2.5
2        100       Green          =>   Red              2            6            2                1.666667
3        100       Green, White   =>   Red              2            6            2                1.666667
4        100       Green          =>   White            2            7            2                1.428571
5        100       Green, Red     =>   White            2            7            2                1.428571
6        100       Orange         =>   White            2            7            2                1.428571
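The lift ratios in the table can be recomputed from its support counts using lift = confidence / benchmark confidence. The sketch below assumes 10 transactions in total, as in the support example earlier ("4 out of 10 transactions"); the tuple layout and function name are illustrative.

```python
N_TRANSACTIONS = 10  # total transactions, taken from the support example

# (antecedent, consequent, support(a), support(c), support(a U c)) per table row.
rules = [
    ("Green", "Red, White", 2, 4, 2),
    ("Green", "Red", 2, 6, 2),
    ("Green, White", "Red", 2, 6, 2),
    ("Green", "White", 2, 7, 2),
    ("Green, Red", "White", 2, 7, 2),
    ("Orange", "White", 2, 7, 2),
]

def lift_ratio(support_a, support_c, support_ac, n):
    confidence = support_ac / support_a   # share of antecedent transactions with consequent
    benchmark = support_c / n             # share of all transactions with consequent
    return confidence / benchmark

lifts = [round(lift_ratio(a, c, ac, N_TRANSACTIONS), 6) for _, _, a, c, ac in rules]
for (ante, cons, *_), value in zip(rules, lifts):
    print(f"{ante} => {cons}: lift {value}")
```

Every rule in the table has 100% confidence, so the differences in lift come entirely from how common the consequent is: the rarer the consequent, the higher the lift.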
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on special, promote, issue coupons for, etc.
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
support = P(Y Λ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z
Hence confidence is a conditional probability, i.e. P(Z|Y)
confidence = P(Y Λ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y Λ Z), to the frequency of co-occurrence expected if Y and Z were independent, P(Y)P(Z)
interest = P(Y Λ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant: could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket: 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
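The combination counts above are binomial coefficients, "n choose k". A quick check with Python's standard library (the variable names are only illustrative):

```python
import math

# Number of ways to choose k distinct items out of n, ignoring order.
fast_food_triples = math.comb(100, 3)
supermarket_pairs = math.comb(10_000, 2)
supermarket_triples = math.comb(10_000, 3)

print(fast_food_triples)    # 161700
print(supermarket_pairs)    # 49995000 (about 50 million)
print(supermarket_triples)  # 166616670000 (about 167 billion)
```

The jump from pairs to triples is what makes exhaustive enumeration impractical at supermarket scale.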
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
The banking case seems to provide well-defined and intelligible information of the form account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: which product or service to offer (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational cost of market basket analysis grows at least with the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
8
Examples
Rule form: LHS => RHS [confidence, support]
  diapers => beer [60%, 0.5%]
"90% of transactions that purchase bread and butter also purchase milk"
  bread and butter => milk [90%, 1%]
9
Example
Tr   Items
T1   Beer, Milk
T2   Bread, Butter
T3   Bread, Butter, Jelly
T4   Bread, Butter, Milk
T5   Beer, Bread

Large itemsets with minsup = 30%

Itemset         Support
Bread           80%
Butter          60%
Milk            40%
Beer            40%
Bread, Butter   60%

Consider the itemset {Bread, Butter} and the two possible rules:
  Bread => Butter
  Butter => Bread
Support(Bread, Butter) / Support(Bread) = 60% / 80% = 0.75, i.e. Confidence(Bread => Butter) = 75%
Support(Bread, Butter) / Support(Butter) = 60% / 60% = 1, i.e. Confidence(Butter => Bread) = 100%
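The support and confidence figures in this example can be reproduced with a few lines of Python. This is a minimal sketch; the helper names `count`, `support` and `confidence` are illustrative and not taken from the slides.

```python
# The five transactions T1..T5 from the example.
transactions = [
    {"Beer", "Milk"},             # T1
    {"Bread", "Butter"},          # T2
    {"Bread", "Butter", "Jelly"}, # T3
    {"Bread", "Butter", "Milk"},  # T4
    {"Beer", "Bread"},            # T5
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    return count(set(lhs) | set(rhs), transactions) / count(lhs, transactions)

print(support({"Bread", "Butter"}, transactions))       # 0.6
print(confidence({"Bread"}, {"Butter"}, transactions))  # 0.75
print(confidence({"Butter"}, {"Bread"}, transactions))  # 1.0
```

Confidence is computed from raw counts rather than from the rounded percentages, which avoids small floating-point discrepancies.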
10
How Good is an Association Rule
Are support and confidence enough?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
  Lift = P(LHS ^ RHS) / (P(LHS) P(RHS))
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
11
The Problem of Lots of Data
Fast food restaurant: could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket: 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Also, the number of transactions in a given time period could be huge (hence expensive to analyze)
12
Preparing Data for MBA
Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to appropriate level
  Depends on objective of model
  Rolling up rare items to get adequate support
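As a rough sketch of the last two steps, the Python below pivots transaction item-lines into itemsets and optionally rolls items up to a product group. The sample item-lines, the `hierarchy` mapping and the function name `to_itemsets` are all hypothetical, made up for illustration; a real product hierarchy would come from the retailer's own data.

```python
from collections import defaultdict

# Hypothetical item-lines: one (order_id, item) row per item purchased.
item_lines = [
    (1, "Coke"), (1, "Pepsi"),
    (2, "Coke"), (2, "Whole milk"),
    (3, "Skim milk"), (3, "Pepsi"),
]

# Hypothetical product hierarchy: item -> product group.
hierarchy = {
    "Coke": "Soda", "Pepsi": "Soda",
    "Whole milk": "Milk", "Skim milk": "Milk",
}

def to_itemsets(item_lines, hierarchy=None):
    """Group item-lines by order; optionally replace each item by its group."""
    baskets = defaultdict(set)
    for order_id, item in item_lines:
        if hierarchy is not None:
            item = hierarchy.get(item, item)  # unmapped items stay as they are
        baskets[order_id].add(item)
    return dict(baskets)

item_level = to_itemsets(item_lines)
group_level = to_itemsets(item_lines, hierarchy)
print(sorted(item_level[2]))   # ['Coke', 'Whole milk']
print(sorted(group_level[2]))  # ['Milk', 'Soda']
```

At the item level no pair occurs in more than one order here, while at the group level {Milk, Soda} occurs in two of three orders, which is the point of rolling up rare items.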
14
Search Approach
Two sub-problems in discovering all association rules:
Find all sets of items (itemsets) that have transaction support above minimum support
  Itemsets that qualify are called large itemsets, and all others small itemsets
Generate from each large itemset rules that use items from the large itemset
  Given a large itemset Y, and X a subset of Y, take the support of Y and divide it by the support of X
  If the ratio c is at least minconf, then the rule X => (Y - X) is satisfied with confidence factor c
15
Reducing Number of Candidates
Apriori principle: if an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
  Also, any superset of a small itemset is also small
A B C D
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as:
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
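The steps above can be sketched as a small level-wise search in Python. This is a simplified illustration of the Apriori idea, not a production implementation; the function name `apriori` is illustrative, and the input is the five-transaction Bread/Butter example with a minimum support of 30%.

```python
from itertools import combinations

transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset meeting min_support."""
    min_count = min_support * len(transactions)

    # Step 2: one-item sets that meet the support criterion.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    # Steps 3-5: build k-item candidates from frequent (k-1)-item sets.
    k = 2
    while current:
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            # Keep a candidate only if every (k-1)-subset is itself frequent.
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent

frequent = apriori(transactions, min_support=0.30)
for itemset, n in frequent.items():
    print(sorted(itemset), n)
```

On this data the search stops after the two-item level and returns the five itemsets Bread, Butter, Milk, Beer and {Bread, Butter}.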
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support; see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If the confidence criterion is 70%, report only rules 2, 3 and 6
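The six candidate rules can be enumerated mechanically: every non-empty proper subset of the item set serves once as the antecedent. The sketch below assumes the support counts implied by the faceplate example (red 6, white 7, green 2, and so on, out of 10 transactions); the function name `rules_from_itemset` is illustrative.

```python
from itertools import combinations

# Support counts (number of transactions) for each relevant item set.
support_count = {
    frozenset(["red"]): 6,
    frozenset(["white"]): 7,
    frozenset(["green"]): 2,
    frozenset(["red", "white"]): 4,
    frozenset(["red", "green"]): 2,
    frozenset(["white", "green"]): 2,
    frozenset(["red", "white", "green"]): 2,
}

def rules_from_itemset(itemset, support_count, min_conf):
    """Return (antecedent, consequent, confidence) for rules meeting min_conf."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for combo in combinations(sorted(itemset), size):
            antecedent = frozenset(combo)
            conf = support_count[itemset] / support_count[antecedent]
            if conf >= min_conf:
                rules.append((antecedent, itemset - antecedent, conf))
    return rules

all_rules = rules_from_itemset({"red", "white", "green"}, support_count, 0.0)
kept = rules_from_itemset({"red", "white", "green"}, support_count, 0.70)
for antecedent, consequent, conf in kept:
    print(sorted(antecedent), "=>", sorted(consequent), f"{conf:.0%}")
```

With a 70% threshold only the three rules whose antecedent contains green survive, each with confidence 100%.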
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)  Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green           Red, White      2           4           2               2.5
2     100      Green           Red             2           6           2               1.666667
3     100      Green, White    Red             2           6           2               1.666667
4     100      Green           White           2           7           2               1.428571
5     100      Green, Red      White           2           7           2               1.428571
6     100      Orange          White           2           7           2               1.428571
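The Conf. and Lift Ratio columns can be recomputed from the three support counts in each row, using lift = confidence / benchmark confidence. The sketch below assumes a total of 10 transactions, as in the earlier support example ("4 out of 10"); the function and variable names are illustrative.

```python
N_TRANSACTIONS = 10  # assumed total, taken from the "4 out of 10" support example

# (antecedent, consequent, support(a), support(c), support(a U c)) as counts
rows = [
    ("Green", "Red, White", 2, 4, 2),
    ("Green", "Red", 2, 6, 2),
    ("Green, White", "Red", 2, 6, 2),
    ("Green", "White", 2, 7, 2),
    ("Green, Red", "White", 2, 7, 2),
    ("Orange", "White", 2, 7, 2),
]

def confidence_and_lift(sup_a, sup_c, sup_ac, n):
    """Confidence = support(a U c) / support(a); lift = confidence / (support(c) / n)."""
    confidence = sup_ac / sup_a
    benchmark = sup_c / n
    return confidence, confidence / benchmark

results = [confidence_and_lift(a, c, ac, N_TRANSACTIONS) for _, _, a, c, ac in rows]
for (ante, cons, *_), (conf, lift) in zip(rows, results):
    print(f"{ante} => {cons}: confidence {conf:.0%}, lift {lift:.6f}")
```

Rounded to six decimals, the computed lifts are 2.5, 1.666667 and 1.428571, matching the table.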
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being the most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products):
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action:
    Store layouts
    Which products to put on special, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products; Y represents the antecedent and Z is called the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ^ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ^ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ^ Z), to the frequency of co-occurrence expected if Y and Z were independent, P(Y)P(Z)
  interest = P(Y ^ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.):
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered:
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product:
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc.
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types:
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well known by those familiar with the business
  Inexplicable Rules: have no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products

                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns:
1. Coke and soda are purchased together more often than most other pairs of items (twice, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
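The co-occurrence matrix can be built directly from the five POS transactions. In this minimal sketch (the function name `cooccurrence` and the nested-dict layout are illustrative choices), each diagonal cell counts the baskets containing that item, and each off-diagonal cell counts the baskets containing both items.

```python
items = ["Coke", "Window cleaner", "Milk", "Soda", "Detergent"]

transactions = [
    {"Coke", "Soda"},                    # customer 1
    {"Milk", "Coke", "Window cleaner"},  # customer 2
    {"Coke", "Detergent"},               # customer 3
    {"Coke", "Detergent", "Soda"},       # customer 4
    {"Window cleaner", "Soda"},          # customer 5
]

def cooccurrence(transactions, items):
    """matrix[a][b] = number of baskets containing both a and b."""
    matrix = {a: {b: 0 for b in items} for a in items}
    for basket in transactions:
        for a in basket:
            for b in basket:
                matrix[a][b] += 1
    return matrix

matrix = cooccurrence(transactions, items)
for a in items:
    print(f"{a:15s}", [matrix[a][b] for b in items])
```

The printed rows match the table above, and the zero cells correspond to the "never purchased with" patterns.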
47
How Good is an Association Rule
What is the confidence for this rule: if a customer purchases soda, then the customer also purchases Coke?
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the number of records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift:
  When lift > 1, the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
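As an illustration of the lift < 1 case, the sketch below computes lift for "if soda then Coke" on the five POS transactions, and for its negation "if soda then NOT Coke". Reading the Negative Rule as a negated consequent is an assumption here, and the `lift` helper, which takes the consequent as a predicate on a basket, is only illustrative.

```python
transactions = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]

def lift(antecedent, consequent_holds, transactions):
    """Lift = confidence / benchmark confidence.

    consequent_holds is a predicate on a basket, so it can express
    either "contains Coke" or "does not contain Coke".
    """
    with_antecedent = [t for t in transactions if antecedent <= t]
    confidence = sum(consequent_holds(t) for t in with_antecedent) / len(with_antecedent)
    benchmark = sum(consequent_holds(t) for t in transactions) / len(transactions)
    return confidence / benchmark

soda_then_coke = lift({"Soda"}, lambda t: "Coke" in t, transactions)
soda_then_not_coke = lift({"Soda"}, lambda t: "Coke" not in t, transactions)
print(round(soda_then_coke, 3))      # 0.833
print(round(soda_then_not_coke, 3))  # 1.667
```

Coke appears in 4 of 5 baskets overall but in only 2 of 3 soda baskets, so the rule does worse than simply guessing Coke, while the negated rule has lift above 1.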
9
Example
Tr Items
T1 Beer Milk
T2 Bread Butter
T3 Bread Butter Jelly
T4 Bread Butter Milk
T5 Beer Bread
Itemset Support
Bread 80
Butter 60
Milk 40
Beer 40
Bread Butter 60
Large Itemsets with minsup=30
Consider the itemset
Bread Butter and the two possible rules
Bread Butter
Butter Bread
Support(Bread Butter)support(Bread = 75
ie Confidence(Bread Butter) = 75
Support(Bread Butter)support(Butter = 1
ie Confidence(Butter Bread) = 100
10
How Good is an Association Rule
Is support and confidence enoughIs support and confidence enough Lift (improvement) tells us how much better a Lift (improvement) tells us how much better a
rule is at predicting the result than just assuming rule is at predicting the result than just assuming the result in the first placethe result in the first place Lift = P(LHS^RHS) (P(LHS)P(RHS) Lift = P(LHS^RHS) (P(LHS)P(RHS)
When lift gt 1 then the rule is better at predicting When lift gt 1 then the rule is better at predicting the result than guessingthe result than guessing
When lift lt 1 the rule is doing worse than When lift lt 1 the rule is doing worse than informed guessing and using the informed guessing and using the Negative RuleNegative Rule produces a better rule than guessingproduces a better rule than guessing
11
The Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Also the number of transactions in a Also the number of transactions in a given time-period could also be huge given time-period could also be huge (hence expensive to analyze)(hence expensive to analyze)
12
Preparing Data for MBA
Determining scope of dataset (one Determining scope of dataset (one or many stores what period etc)or many stores what period etc)
Converting transaction data to Converting transaction data to itemsetsitemsets
Generalizing items to appropriate Generalizing items to appropriate levellevel Depends on objective of modelDepends on objective of model Rolling up rare items to get adequate Rolling up rare items to get adequate
supportsupport
13
Preparing Data for MBA
Determining scope of dataset (one Determining scope of dataset (one or many stores what period etc)or many stores what period etc)
Converting transaction data to Converting transaction data to itemsetsitemsets
Generalizing items to appropriate Generalizing items to appropriate levellevel Depends on objective of modelDepends on objective of model Rolling up rare items to get adequate Rolling up rare items to get adequate
supportsupport
14
Search Approach
Two sub-problems in discovering all association Two sub-problems in discovering all association rulesrules
Find all sets of items (itemsets) that have Find all sets of items (itemsets) that have transaction support above minimum supporttransaction support above minimum support
Itemsets that qualify are called Itemsets that qualify are called largelarge itemsets itemsets and and all others all others smallsmall itemsets itemsets
Generate from each large itemset rules that Generate from each large itemset rules that use items from the large itemsetuse items from the large itemset
Given a large itemset Given a large itemset YY and and XX is a subset of is a subset of YY Take the support of Take the support of YY and divide it by the support of and divide it by the support of XX If the ratio c is at least If the ratio c is at least minconfminconf then then XX ( (YY - - XX) is ) is
satisfied with confidence factor csatisfied with confidence factor c
15
Reducing Number of Candidates
Apriori principle: if an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
Also, any superset of a small itemset is also small
A B C D
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as:
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
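The level-by-level procedure above can be sketched in a few lines. This is an illustrative sketch rather than any particular tool's implementation: the function name `apriori` and the `min_count` threshold are assumptions, and the data are the five POS transactions from the Coke/soda example elsewhere in this deck.

```python
from itertools import combinations

# The five POS transactions from the Coke/soda example in this deck.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def apriori(transactions, min_count):
    """Return {itemset: support count} for every itemset meeting min_count."""
    def count(candidates):
        # Count the transactions containing each candidate; keep the frequent ones.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    items = {item for t in transactions for item in t}
    current = count({frozenset([i]) for i in items})   # one-item sets
    frequent = dict(current)
    k = 2
    while current:
        # Join: unions of frequent (k-1)-item sets that contain exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-item subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent

frequent = apriori(transactions, min_count=2)   # minimum support: 2 of 5 baskets
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With a minimum count of 2, milk is dropped at the first level, and the three-item candidate {Coke, detergent, soda} is pruned without being counted because {detergent, soda} is not frequent.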
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support, see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
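The rule-generation step for this item set can be sketched as below. The support counts are the ones quoted in the faceplate slides (including the XLMiner output table); the function name `rules_from` is an assumption made for illustration.

```python
from itertools import combinations

# Support counts (number of transactions) quoted in the faceplate slides.
support = {
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red", "white", "green"}): 2,
}

def rules_from(itemset, support, minconf):
    """Yield (antecedent, consequent, confidence) for each rule X => (Y - X)."""
    y = frozenset(itemset)
    for size in range(1, len(y)):
        for x in map(frozenset, combinations(sorted(y), size)):
            conf = support[y] / support[x]   # support(Y) / support(X)
            if conf >= minconf:
                yield x, y - x, conf

all_rules = list(rules_from({"red", "white", "green"}, support, minconf=0.0))
kept = list(rules_from({"red", "white", "green"}, support, minconf=0.70))
for antecedent, consequent, conf in kept:
    print(sorted(antecedent), "=>", sorted(consequent), f"{conf:.0%}")
```

Six rules are generated in total; the three that survive the 70% criterion all have green in the antecedent, since green appears in only two transactions and both of them also contain red and white.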
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green =>         Red, White      2           4           2               2.5
2     100      Green =>         Red             2           6           2               1.666667
3     100      Green, White =>  Red             2           6           2               1.666667
4     100      Green =>         White           2           7           2               1.428571
5     100      Green, Red =>    White           2           7           2               1.428571
6     100      Orange =>        White           2           7           2               1.428571
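The lift ratios in this output can be re-derived from the support columns, as a quick check. The sketch below assumes the 10 transactions stated on the Support slide; the helper name `lift_ratio` is made up for illustration.

```python
N_TRANSACTIONS = 10  # the faceplate example has 10 transactions

# (antecedent, consequent, support(a), support(c), support(a U c)) from the table.
rules = [
    ("Green", "Red, White", 2, 4, 2),
    ("Green", "Red", 2, 6, 2),
    ("Green, White", "Red", 2, 6, 2),
    ("Green", "White", 2, 7, 2),
    ("Green, Red", "White", 2, 7, 2),
    ("Orange", "White", 2, 7, 2),
]

def lift_ratio(sup_a, sup_c, sup_ac, n):
    """Lift = confidence / benchmark confidence."""
    confidence = sup_ac / sup_a   # share of antecedent transactions with the consequent
    benchmark = sup_c / n         # share of all transactions with the consequent
    return confidence / benchmark

lifts = [lift_ratio(a, c, ac, N_TRANSACTIONS) for _, _, a, c, ac in rules]
for (ante, cons, *_), value in zip(rules, lifts):
    print(f"{ante} => {cons}: lift {value:.6f}")
```

All six rules have 100% confidence, so they differ only in how common the consequent is: the rarer the consequent, the higher the lift.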
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on special, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e., P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption that Y and Z are independent, P(Y) P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
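As a small worked identity, using only the definitions on this slide plus the lift definition from the Measures of Performance slide, interest can be rewritten as confidence divided by the base rate of the consequent:

```latex
\[
\text{interest}(Y \Rightarrow Z)
  = \frac{P(Y \wedge Z)}{P(Y)\,P(Z)}
  = \frac{P(Y \wedge Z)/P(Y)}{P(Z)}
  = \frac{\text{confidence}(Y \Rightarrow Z)}{P(Z)}
\]
```

Here P(Z) plays the role of the benchmark confidence, which suggests that interest and lift are the same quantity under two names, and that the "interest of at least one" criterion corresponds to lift of at least 1.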
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)...
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types:
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule?
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
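The co-occurrence counts can be reproduced directly from the five POS transactions. The sketch below is one simple way to do it; the variable names are illustrative.

```python
# The five POS transactions from this slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]

# cooc[a][b] = number of transactions containing both a and b.
# The diagonal cooc[a][a] is simply the number of transactions containing a.
cooc = {
    a: {b: sum(a in t and b in t for t in transactions) for b in items}
    for a in items
}

for a in items:
    print(f"{a:15}", *(f"{cooc[a][b]:3}" for b in items))
```

The matrix is symmetric, so only the diagonal and one triangle need to be computed when the number of items is large.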
46
How Good is an Association Rule?
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
Simple patterns:
1. Coke and soda are purchased together as often as any other two items (2 of the 5 transactions, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule?
What is the confidence for this rule: if a customer purchases soda, then the customer also purchases Coke?
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
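The two confidence figures on this slide can be checked with a short function. This is a minimal sketch; the function name `confidence` and its signature are assumptions made for illustration.

```python
# The five POS transactions from this slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def confidence(transactions, antecedent, consequent):
    """Transactions with all the items divided by transactions with the "if" items."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_if = sum(antecedent <= t for t in transactions)
    n_all = sum((antecedent | consequent) <= t for t in transactions)
    if n_if == 0:
        raise ValueError("antecedent never occurs, so confidence is undefined")
    return n_all / n_if

soda_to_coke = confidence(transactions, {"soda"}, {"Coke"})   # 2 of 3
coke_to_soda = confidence(transactions, {"Coke"}, {"soda"})   # 2 of 4
print(f"soda => Coke: {soda_to_coke:.0%}")
print(f"Coke => soda: {coke_to_soda:.0%}")
```

Both rules share the same numerator (the two baskets holding Coke and soda); only the denominator changes when the rule is reversed, which is why confidence is not symmetric.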
48
How Good is an Association Rule?
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift... When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
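To make the lift < 1 case concrete, the sketch below computes lift for "if soda then Coke" on the five POS transactions from the earlier slides, along with the negative rule "if soda then NOT Coke". The `lift` function and its `negate` flag are illustrative assumptions, not part of the original deck.

```python
# The five POS transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def lift(transactions, antecedent, consequent, negate=False):
    """Observed co-occurrence rate divided by the rate expected under independence.

    With negate=True the consequent is read as "does NOT contain these items".
    """
    n = len(transactions)
    has_if = [antecedent <= t for t in transactions]
    has_then = [(consequent <= t) != negate for t in transactions]
    p_if = sum(has_if) / n
    p_then = sum(has_then) / n
    p_both = sum(a and b for a, b in zip(has_if, has_then)) / n
    return p_both / (p_if * p_then)

rule = lift(transactions, {"soda"}, {"Coke"})
negative_rule = lift(transactions, {"soda"}, {"Coke"}, negate=True)
print(f"soda => Coke:     lift {rule:.2f}")
print(f"soda => not Coke: lift {negative_rule:.2f}")
```

On this tiny data set the rule has lift below 1 because Coke is in 80% of all baskets but only 67% of soda baskets, while the negative rule has lift above 1. With only five transactions this is purely an arithmetic illustration.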
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
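The combination counts above follow from the binomial coefficient "n choose k". A quick check using Python's standard library (the variable names are illustrative):

```python
from math import comb

menu_triples = comb(100, 3)              # 3-item combinations from a 100-item menu
supermarket_pairs = comb(10_000, 2)      # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)    # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```

The exact values are 161,700 for the menu, 49,995,000 (roughly 50 million) for supermarket pairs, and 166,616,670,000 (well over 100 billion) for supermarket triples.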
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide a guide to action: to offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut, due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
12
Preparing Data for MBA
Determining scope of dataset (one Determining scope of dataset (one or many stores what period etc)or many stores what period etc)
Converting transaction data to Converting transaction data to itemsetsitemsets
Generalizing items to appropriate Generalizing items to appropriate levellevel Depends on objective of modelDepends on objective of model Rolling up rare items to get adequate Rolling up rare items to get adequate
supportsupport
13
Preparing Data for MBA
Determining scope of dataset (one Determining scope of dataset (one or many stores what period etc)or many stores what period etc)
Converting transaction data to Converting transaction data to itemsetsitemsets
Generalizing items to appropriate Generalizing items to appropriate levellevel Depends on objective of modelDepends on objective of model Rolling up rare items to get adequate Rolling up rare items to get adequate
supportsupport
14
Search Approach
Two sub-problems in discovering all association Two sub-problems in discovering all association rulesrules
Find all sets of items (itemsets) that have Find all sets of items (itemsets) that have transaction support above minimum supporttransaction support above minimum support
Itemsets that qualify are called Itemsets that qualify are called largelarge itemsets itemsets and and all others all others smallsmall itemsets itemsets
Generate from each large itemset rules that Generate from each large itemset rules that use items from the large itemsetuse items from the large itemset
Given a large itemset Given a large itemset YY and and XX is a subset of is a subset of YY Take the support of Take the support of YY and divide it by the support of and divide it by the support of XX If the ratio c is at least If the ratio c is at least minconfminconf then then XX ( (YY - - XX) is ) is
satisfied with confidence factor csatisfied with confidence factor c
15
Reducing Number of Candidates
Apriori principleApriori principle If an itemset is large then all of its If an itemset is large then all of its
subsets must also be largesubsets must also be large
Support of an itemset never exceeds the Support of an itemset never exceeds the support of its subsetssupport of its subsets
16
The Apriori Algorithm
Progressively Progressively identifies large identifies large itemsets of itemsets of different sizesdifferent sizes
Exploits the Exploits the property that any property that any subset of a large subset of a large itemset is also a itemset is also a large itemsetlarge itemset Also any superset Also any superset
of a small itemset of a small itemset is also smallis also small
A C DB
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
ldquoldquoIFrdquo part = IFrdquo part = antecedentantecedent
ldquoldquoTHENrdquo part = THENrdquo part = consequentconsequent
ldquoldquoItem setrdquo = the items (eg products) Item setrdquo = the items (eg products) comprising the antecedent or consequentcomprising the antecedent or consequent
Antecedent and consequent are Antecedent and consequent are disjointdisjoint (ie have no items in common)(ie have no items in common)
20
Tiny Example Phone Faceplates
21
Many Rules are Possible
For example Transaction 1 supports For example Transaction 1 supports several rules such as several rules such as ldquoldquoIf red then whiterdquo (ldquoIf a red faceplate If red then whiterdquo (ldquoIf a red faceplate
is purchased then so is a white onerdquo)is purchased then so is a white onerdquo) ldquoldquoIf white then redrdquoIf white then redrdquo ldquoldquoIf red and white then greenrdquoIf red and white then greenrdquo + several more+ several more
22
Frequent Item Sets
Ideally we want to create all possible Ideally we want to create all possible combinations of itemscombinations of items
ProblemProblem computation time grows computation time grows exponentially as items increasesexponentially as items increases
SolutionSolution consider only ldquofrequent item consider only ldquofrequent item setsrdquosetsrdquo
Criterion for frequent Criterion for frequent supportsupport
23
Support
SupportSupport = (or percent) of = (or percent) of transactions that include both the transactions that include both the antecedent and the consequentantecedent and the consequent
Example support for the item set Example support for the item set red white is 4 out of 10 red white is 4 out of 10 transactions or 40transactions or 40
24
Apriori Algorithm
25
Generating Frequent Item Sets
For For kk productshellip productshellip
11 User sets a minimum support criterionUser sets a minimum support criterion
22 Next generate list of one-item sets that Next generate list of one-item sets that meet the support criterionmeet the support criterion
33 Use the list of one-item sets to generate Use the list of one-item sets to generate list of two-item sets that meet the list of two-item sets that meet the support criterionsupport criterion
44 Use list of two-item sets to generate list Use list of two-item sets to generate list of three-item setsof three-item sets
55 Continue up through Continue up through kk-item sets-item sets
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y is the antecedent and Z is the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and product Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption that Y and Z are independent, P(Y) P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
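As a rough illustration of the three measures, the sketch below computes support, confidence and interest for a rule Y => Z. The basket contents and the function name rule_measures are hypothetical and chosen only for this example.

```python
def rule_measures(baskets, antecedent, consequent):
    """Return (support, confidence, interest) for antecedent => consequent."""
    n = len(baskets)
    # Fraction of baskets containing Y, Z, and both Y and Z
    p_y = sum(antecedent <= b for b in baskets) / n
    p_z = sum(consequent <= b for b in baskets) / n
    p_yz = sum((antecedent | consequent) <= b for b in baskets) / n
    support = p_yz
    confidence = p_yz / p_y           # P(Z | Y)
    interest = p_yz / (p_y * p_z)     # observed vs. expected under independence
    return support, confidence, interest


# Hypothetical baskets, not taken from the slides
baskets = [
    {"bread", "butter"},
    {"bread"},
    {"bread", "butter", "jam"},
    {"jam"},
]
support, confidence, interest = rule_measures(baskets, {"bread"}, {"butter"})
print(support, confidence, interest)
```

Here the interest is above one, so under the criteria above the rule would be kept provided it also met the chosen minimum support and confidence.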
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.):
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered:
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product:
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types:
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products

                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
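The co-occurrence counts can be derived directly from the five POS transactions. The following is a minimal sketch, assuming each basket is held as a Python set; the function name co_occurrence is made up for this example.

```python
from collections import Counter
from itertools import combinations

# The five POS transactions from this slide
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def co_occurrence(transactions):
    """Count, for every ordered pair of items, the baskets containing both."""
    counts = Counter()
    for basket in transactions:
        for item in basket:
            counts[(item, item)] += 1      # diagonal: baskets containing the item
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1            # keep the matrix symmetric
            counts[(b, a)] += 1
    return counts


matrix = co_occurrence(transactions)
print(matrix[("Coke", "soda")], matrix[("detergent", "milk")])
```

Pairs that never occur together are simply absent from the Counter and read back as zero.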
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns:
1. Coke and soda are purchased together more often than most other pairs of items (twice, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule: "If a customer purchases soda, then customer also purchases Coke"?
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift... When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
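To make the lift and Negative Rule ideas concrete, here is a small sketch applied to the five POS transactions from the earlier slides. The function name lift and its negate flag are assumptions made for this illustration.

```python
# The five POS transactions from the earlier slides
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def lift(transactions, antecedent, consequent, negate=False):
    """Lift of antecedent => consequent, or => NOT consequent if negate is True."""
    n = len(transactions)
    has_a = [antecedent <= t for t in transactions]
    # With negate=True the test flips to "basket does not contain the consequent"
    has_c = [(consequent <= t) != negate for t in transactions]
    p_a = sum(has_a) / n
    p_c = sum(has_c) / n
    p_ac = sum(a and c for a, c in zip(has_a, has_c)) / n
    return p_ac / (p_a * p_c)


rule_lift = lift(transactions, {"soda"}, {"Coke"})
negative_lift = lift(transactions, {"soda"}, {"Coke"}, negate=True)
print(round(rule_lift, 3), round(negative_lift, 3))
```

On this data "if soda then Coke" has lift below 1, because Coke is in 4 of the 5 baskets anyway, while the negative rule "if soda then not Coke" has lift above 1.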
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
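The combination counts on this slide are binomial coefficients, so they can be checked with the standard library. The variable names below are only illustrative.

```python
from math import comb

# Number of ways to choose k distinct items out of n, ignoring order
menu_triples = comb(100, 3)              # fast food menu, 3 items
supermarket_pairs = comb(10_000, 2)      # supermarket, 2 items
supermarket_triples = comb(10_000, 3)    # supermarket, 3 items

print(f"{menu_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```

The exact values are 161,700, 49,995,000 (about 50 million) and 166,616,670,000, which is why the 3-item case is described in the hundreds of billions.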
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
The banking case seems to provide well-defined and intelligible information of the form:
  account_1 and account_2, etc., or
  activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
11
The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Also, the number of transactions in a given time period could be huge (hence expensive to analyze)
12
Preparing Data for MBA
Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to appropriate level
  Depends on objective of model
  Rolling up rare items to get adequate support
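A minimal sketch of two of these preparation steps, converting transaction line items to itemsets and rolling items up to a coarser level, might look as follows. The line-item records, the category mapping and the function name to_itemsets are all hypothetical.

```python
from collections import defaultdict

# Hypothetical line-item records: one (order_id, item) row per item purchased
line_items = [
    (1, "Coke"), (1, "Pepsi"),
    (2, "Coke"), (2, "whole milk"),
    (3, "skim milk"),
]

# Hypothetical product hierarchy used to roll rare items up one level
category = {
    "Coke": "soft drinks",
    "Pepsi": "soft drinks",
    "whole milk": "milk",
    "skim milk": "milk",
}


def to_itemsets(line_items, rollup=None):
    """Group line items by order; optionally map each item to its category."""
    baskets = defaultdict(set)
    for order_id, item in line_items:
        baskets[order_id].add(rollup.get(item, item) if rollup else item)
    return dict(baskets)


itemsets = to_itemsets(line_items)
rolled_up = to_itemsets(line_items, rollup=category)
print(itemsets)
print(rolled_up)
```

After rolling up, "milk" appears in two of the three baskets, whereas each individual milk product appeared in only one, which is the sense in which generalizing helps rare items reach adequate support.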
13
14
Search Approach
Two sub-problems in discovering all association rules:
1. Find all sets of items (itemsets) that have transaction support above minimum support
   Itemsets that qualify are called large itemsets, and all others small itemsets
2. Generate from each large itemset rules that use items from the large itemset
   Given a large itemset Y, and X is a subset of Y
   Take the support of Y and divide it by the support of X
   If the ratio c is at least minconf, then X => (Y - X) is satisfied with confidence factor c
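The second sub-problem can be sketched as below. The support counts are the ones quoted for the phone faceplate example elsewhere in this deck (out of 10 transactions); the function name rules_from_itemset and the dictionary layout are assumptions made for the illustration.

```python
from itertools import combinations


def rules_from_itemset(itemset, support, minconf):
    """Return (X, Y - X, c) for every proper subset X of Y with c >= minconf."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for x in combinations(sorted(itemset), size):
            x = frozenset(x)
            c = support[itemset] / support[x]   # support(Y) / support(X)
            if c >= minconf:
                rules.append((x, itemset - x, c))
    return rules


# Support counts taken from the faceplate example in this deck
support = {
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red", "white", "green"}): 2,
}

rules = rules_from_itemset({"red", "white", "green"}, support, minconf=0.7)
for x, rest, c in rules:
    print(sorted(x), "=>", sorted(rest), f"confidence {c:.0%}")
```

With minconf set to 70%, three of the six candidate rules survive, each with 100% confidence.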
15
Reducing Number of Candidates
Apriori principle:
  If an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
  Also, any superset of a small itemset is also small

A B C D
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
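The pruning property can be sketched on the A to D lattice above: a candidate of size k+1 is kept only if every one of its k-item subsets is large. The list of large pairs below is a hypothetical input, and prune_candidates is a name invented for this example.

```python
from itertools import combinations


def prune_candidates(large_k):
    """Build (k+1)-item candidates whose k-item subsets are all large."""
    large_k = {frozenset(s) for s in large_k}
    k = len(next(iter(large_k)))
    items = sorted(set().union(*large_k))
    candidates = set()
    for combo in combinations(items, k + 1):
        # Any superset of a small itemset is small, so drop it immediately
        if all(frozenset(sub) in large_k for sub in combinations(combo, k)):
            candidates.add(frozenset(combo))
    return candidates


# Hypothetical result of the 2-item pass: BD and CD turned out to be small
large_pairs = ["AB", "AC", "AD", "BC"]
candidates = prune_candidates(large_pairs)
print([sorted(c) for c in candidates])
```

Only ABC survives: ABD, ACD and BCD each contain BD or CD, so their support never needs to be counted. This brute-force enumeration is for illustration; practical implementations join pairs of large itemsets instead of enumerating all combinations.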
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as:
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
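The level-by-level procedure above can be sketched in a few lines. This is a simplified illustration rather than a tuned implementation; it reuses the five Coke/soda POS baskets from elsewhere in this deck, and the function name apriori and the min_count parameter (minimum support expressed as a transaction count) are assumptions.

```python
def apriori(transactions, min_count):
    """Return {item set: count} for all item sets meeting the support criterion."""
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}      # candidate one-item sets
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those meeting the support criterion
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Build (k+1)-item candidates from the frequent k-item sets
        k += 1
        current = {a | b for a in level for b in level if len(a | b) == k}
        # Keep only candidates whose k-item subsets are all frequent
        current = {c for c in current if all(c - {i} in level for i in c)}
    return frequent


transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
frequent = apriori(transactions, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a minimum of 2 transactions, milk drops out at the first pass, two pairs survive the second pass, and no three-item set is ever counted because its only candidate contains an infrequent pair.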
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format Binary Matrix
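In the binary matrix format, each row is a transaction and each column is an item, with 1 meaning the item was purchased. The sketch below shows one way to build it; the three transactions and the function name to_binary_matrix are hypothetical and are not the data from the slide.

```python
def to_binary_matrix(transactions):
    """Return (column names, rows of 0/1 flags), one row per transaction."""
    columns = sorted(set().union(*transactions))
    rows = [[int(item in t) for item in columns] for t in transactions]
    return columns, rows


# Hypothetical faceplate purchases
transactions = [
    {"red", "white", "green"},
    {"white", "orange"},
    {"white", "blue"},
]
columns, rows = to_binary_matrix(transactions)
print(columns)
for row in rows:
    print(row)
```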
28
Process of Rule Selection
Generate all rules that meet specified support & confidence:
  Find frequent item sets (those with sufficient support; see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
30
All Rules (XLMiner Output)
Rule  Conf %  Antecedent (a)  Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100     Green           Red, White      2           4           2               2.5
2     100     Green           Red             2           6           2               1.666667
3     100     Green, White    Red             2           6           2               1.666667
4     100     Green           White           2           7           2               1.428571
5     100     Green, Red      White           2           7           2               1.428571
6     100     Orange          White           2           7           2               1.428571
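The Lift Ratio column can be reproduced from the three support counts in each row, assuming the 10 transactions mentioned on the Support slide. The function name lift_ratio is an illustrative choice, not XLMiner's.

```python
def lift_ratio(support_a, support_c, support_ac, n_transactions):
    """Lift = confidence / benchmark confidence, from raw support counts."""
    confidence = support_ac / support_a        # share of antecedent rows with consequent
    benchmark = support_c / n_transactions     # share of all rows with consequent
    return confidence / benchmark


# (Support(a), Support(c), Support(a U c)) for three rows of the table
lift_rule_1 = lift_ratio(2, 4, 2, 10)   # Green => Red, White
lift_rule_2 = lift_ratio(2, 6, 2, 10)   # Green => Red
lift_rule_4 = lift_ratio(2, 7, 2, 10)   # Green => White
print(round(lift_rule_1, 6), round(lift_rule_2, 6), round(lift_rule_4, 6))
```

Because every rule in the table has 100% confidence, the lift ratios differ only through how common the consequent is: the rarer the consequent, the higher the lift.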
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items: "if Coke then soda"
2. Generate co-occurrence matrix for two items: "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items: "if Coke and Milk and Window Cleaner then soda"
4. Etc.
51
Final Thought on Association Rules: The Problem of Lots of Data
A fast food restaurant could have 100 items on its menu.
  How many combinations are there with 3 different menu items? 161,700
A supermarket has 10,000 or more unique items.
  About 50 million 2-item combinations
  More than 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue.
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze).
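The combination counts can be reproduced with the binomial coefficient. The sketch below uses Python's standard `math.comb`; the variable names are illustrative. It gives the exact values, which the slide quotes only as round figures.

```python
from math import comb

# Number of distinct 3-item combinations on a 100-item menu.
menu_triples = comb(100, 3)

# Number of distinct 2-item and 3-item combinations among 10,000 supermarket items.
supermarket_pairs = comb(10_000, 2)
supermarket_triples = comb(10_000, 3)

print(f"{menu_triples:,}")         # 161,700
print(f"{supermarket_pairs:,}")    # 49,995,000
print(f"{supermarket_triples:,}")  # 166,616,670,000
```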
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
The banking case seems to provide well-defined and intelligible information of the form account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time.
As such, the rules found provide a guide to action: offering a product or service (cross-sell).
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules.
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business.
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements.
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results.
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications.
64
Thank You
12
Preparing Data for MBA
Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to appropriate level
  Depends on objective of model
  Rolling up rare items to get adequate support
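A minimal sketch of the last two preparation steps: item-line records are pivoted into one itemset per order, and rare items are optionally rolled up to a higher level. The record layout, the `to_itemsets` helper and the "cleaning products" grouping are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical item-line records: one (order_id, item) row per item purchased.
item_lines = [
    (1, "coke"), (1, "soda"),
    (2, "milk"), (2, "coke"), (2, "window cleaner"),
    (3, "coke"), (3, "detergent"),
    (4, "coke"), (4, "detergent"), (4, "soda"),
    (5, "window cleaner"), (5, "soda"),
]

# Hypothetical product hierarchy used to roll rare items up one level.
hierarchy = {
    "window cleaner": "cleaning products",
    "detergent": "cleaning products",
}

def to_itemsets(lines, rollup=None):
    """Pivot item-lines into {order_id: set of items}, optionally generalizing items."""
    rollup = rollup or {}
    baskets = defaultdict(set)
    for order_id, item in lines:
        baskets[order_id].add(rollup.get(item, item))
    return dict(baskets)

itemsets = to_itemsets(item_lines)
rolled_up = to_itemsets(item_lines, hierarchy)

# Support count of the generalized item after the roll-up.
cleaning_support = sum("cleaning products" in b for b in rolled_up.values())
print(itemsets[2], rolled_up[4], cleaning_support)
```

Each cleaning item appears in only 2 of the 5 orders, while the rolled-up group appears in 4, which is the sense in which generalizing helps rare items reach adequate support.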
14
Search Approach
Two sub-problems in discovering all association rules:
Find all sets of items (itemsets) that have transaction support above minimum support.
  Itemsets that qualify are called large itemsets, and all others small itemsets.
Generate from each large itemset rules that use items from the large itemset.
  Given a large itemset Y, and X a subset of Y:
  Take the support of Y and divide it by the support of X.
  If the ratio c is at least minconf, then the rule X => (Y - X) is satisfied with confidence factor c.
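The second sub-problem can be sketched as follows, using the five POS transactions from the Coke and soda example elsewhere in the deck. The function name `rules_from_itemset` and the 0.6 threshold are assumptions for illustration; the sketch also assumes the itemset passed in really is large, so its support is non-zero.

```python
from itertools import combinations

transactions = [
    {"coke", "soda"},
    {"milk", "coke", "window cleaner"},
    {"coke", "detergent"},
    {"coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def support(items):
    """Number of transactions containing every item in `items`."""
    return sum(1 for basket in transactions if items <= basket)

def rules_from_itemset(large_itemset, minconf):
    """Return (X, Y - X, c) for each non-empty proper subset X of Y with c >= minconf."""
    y = frozenset(large_itemset)
    rules = []
    for size in range(1, len(y)):
        for subset in combinations(sorted(y), size):
            x = frozenset(subset)
            c = support(y) / support(x)  # confidence of X => (Y - X)
            if c >= minconf:
                rules.append((set(x), set(y - x), c))
    return rules

rules = rules_from_itemset({"coke", "detergent"}, minconf=0.6)
print(rules)
```

Here detergent => Coke passes (2/2), while Coke => detergent is dropped (2/4 is below the threshold).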
15
Reducing Number of Candidates
Apriori principle: if an itemset is large, then all of its subsets must also be large.
Support of an itemset never exceeds the support of its subsets.
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes.
Exploits the property that any subset of a large itemset is also a large itemset.
  Also, any superset of a small itemset is also small.
Itemset lattice (slide figure):
A  B  C  D
AB  AC  AD  BC  BD  CD
ABC  ABD  ACD  BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as:
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items.
Problem: computation time grows exponentially as the number of items increases.
Solution: consider only "frequent item sets".
Criterion for frequent: support
23
Support
Support = number (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products:
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
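The steps above can be sketched as a small level-wise search. This is a simplified illustration of the Apriori idea rather than a tuned implementation; the function name `apriori`, the `min_count` parameter (minimum support expressed as a transaction count) and the use of the Coke and soda transactions from elsewhere in the deck are assumptions made here.

```python
from itertools import combinations

transactions = [
    {"coke", "soda"},
    {"milk", "coke", "window cleaner"},
    {"coke", "detergent"},
    {"coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def apriori(baskets, min_count):
    """Return {itemset: count} for every itemset found in at least min_count baskets."""
    def count(itemset):
        return sum(1 for b in baskets if itemset <= b)

    # Steps 1-2: one-item sets that meet the support criterion.
    items = sorted({item for b in baskets for item in b})
    current = {frozenset([i]): count(frozenset([i])) for i in items}
    current = {s: c for s, c in current.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join: merge frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-item subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count the surviving candidates and keep those meeting the criterion.
        current = {c: count(c) for c in candidates}
        current = {s: n for s, n in current.items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent

frequent = apriori(transactions, min_count=2)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With a minimum count of 2, milk is dropped at the first level, and the only three-item candidate, {Coke, detergent, soda}, is pruned without being counted because {detergent, soda} is not frequent.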
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format Binary Matrix
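The slide's own matrix is a figure that did not survive in this transcript. As a stand-in, the sketch below encodes the five Coke and soda transactions from elsewhere in the deck as a binary matrix, one row per transaction and one column per item. The alphabetical column order is an arbitrary choice.

```python
transactions = [
    {"coke", "soda"},
    {"milk", "coke", "window cleaner"},
    {"coke", "detergent"},
    {"coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

# One column per distinct item, in alphabetical order.
items = sorted({item for basket in transactions for item in basket})

# 1 if the transaction contains the item, 0 otherwise.
matrix = [[1 if item in basket else 0 for item in items] for basket in transactions]

# Column sums give the support count of each single item.
column_sums = [sum(col) for col in zip(*matrix)]

print(items)
for row in matrix:
    print(row)
print(column_sums)
```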
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support; see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more, with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
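All six confidence values can be reproduced from the support counts quoted in the deck ({red, white, green} = 2, {red, white} = 4, {red, green} = 2, {white, green} = 2, red = 6, white = 7, green = 2). The sketch below enumerates every antecedent; the order in which it lists the rules is arbitrary and need not match the slide's rule numbering.

```python
from itertools import combinations

# Support counts taken from the faceplate example (out of 10 transactions).
support = {
    frozenset({"red", "white", "green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
}

full = frozenset({"red", "white", "green"})
rules = []
for size in (2, 1):
    for subset in combinations(sorted(full), size):
        antecedent = frozenset(subset)
        conf = support[full] / support[antecedent]
        rules.append((sorted(antecedent), sorted(full - antecedent), conf))

# Apply the 70% confidence criterion.
kept = [rule for rule in rules if rule[2] >= 0.70]

for antecedent, consequent, conf in rules:
    print(antecedent, "=>", consequent, f"{conf:.0%}")
```

Three rules clear the 70% criterion, all with 100% confidence, which agrees with the slide reporting three of the six rules.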
30
All Rules (XLMiner Output)
Rule #  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1       100      Green =>         Red, White      2           4           2               2.5
2       100      Green =>         Red             2           6           2               1.666667
3       100      Green, White =>  Red             2           6           2               1.666667
4       100      Green =>         White           2           7           2               1.428571
5       100      Green, Red =>    White           2           7           2               1.428571
6       100      Orange =>        White           2           7           2               1.428571
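The lift ratios in the output can be recomputed from the support columns as confidence divided by benchmark confidence. This sketch assumes the 10 transactions of the faceplate example; the tuple layout and the `lift_ratio` helper are illustrative.

```python
N_TRANSACTIONS = 10  # size of the faceplate example, per the support slide

# (antecedent, consequent, support(a), support(c), support(a U c), reported lift)
rows = [
    ("Green", "Red, White", 2, 4, 2, 2.5),
    ("Green", "Red", 2, 6, 2, 1.666667),
    ("Green, White", "Red", 2, 6, 2, 1.666667),
    ("Green", "White", 2, 7, 2, 1.428571),
    ("Green, Red", "White", 2, 7, 2, 1.428571),
    ("Orange", "White", 2, 7, 2, 1.428571),
]

def lift_ratio(sup_a, sup_c, sup_ac, n):
    """Confidence of the rule divided by the benchmark confidence of the consequent."""
    confidence = sup_ac / sup_a
    benchmark = sup_c / n
    return confidence / benchmark

computed = [lift_ratio(a, c, ac, N_TRANSACTIONS) for _, _, a, c, ac, _ in rows]
for (ante, cons, *_), value in zip(rows, computed):
    print(f"{ante} => {cons}: lift {value:.6f}")
```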
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z.
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y).
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y) P(Z).
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one.
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.):
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg number of orders/customer
  Avg number of unique items/order
  Avg number of items/order
  For a product
    What % of customers have purchased
    Avg number of orders/customer that include it
    Avg quantity of it purchased/order
  Etc.
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types:
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well-known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
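The co-occurrence table can be derived directly from the five transactions. In this sketch, each cell counts the transactions that contain both the row item and the column item, so the diagonal holds the count for each single item. The nested-dictionary layout is an illustrative choice.

```python
transactions = [
    {"coke", "soda"},
    {"milk", "coke", "window cleaner"},
    {"coke", "detergent"},
    {"coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

# Same item order as the slide's table.
items = ["coke", "window cleaner", "milk", "soda", "detergent"]

# cooccurrence[a][b] = number of transactions containing both a and b.
cooccurrence = {
    a: {b: sum(1 for t in transactions if a in t and b in t) for b in items}
    for a in items
}

for a in items:
    print(f"{a:15s}", [cooccurrence[a][b] for b in items])
```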
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
Simple patterns:
1. Coke and soda are purchased together as often as any other two items (twice, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
13
Preparing Data for MBA
Determining scope of dataset (one Determining scope of dataset (one or many stores what period etc)or many stores what period etc)
Converting transaction data to Converting transaction data to itemsetsitemsets
Generalizing items to appropriate Generalizing items to appropriate levellevel Depends on objective of modelDepends on objective of model Rolling up rare items to get adequate Rolling up rare items to get adequate
supportsupport
14
Search Approach
Two sub-problems in discovering all association Two sub-problems in discovering all association rulesrules
Find all sets of items (itemsets) that have Find all sets of items (itemsets) that have transaction support above minimum supporttransaction support above minimum support
Itemsets that qualify are called Itemsets that qualify are called largelarge itemsets itemsets and and all others all others smallsmall itemsets itemsets
Generate from each large itemset rules that Generate from each large itemset rules that use items from the large itemsetuse items from the large itemset
Given a large itemset Given a large itemset YY and and XX is a subset of is a subset of YY Take the support of Take the support of YY and divide it by the support of and divide it by the support of XX If the ratio c is at least If the ratio c is at least minconfminconf then then XX ( (YY - - XX) is ) is
satisfied with confidence factor csatisfied with confidence factor c
15
Reducing Number of Candidates
Apriori principleApriori principle If an itemset is large then all of its If an itemset is large then all of its
subsets must also be largesubsets must also be large
Support of an itemset never exceeds the Support of an itemset never exceeds the support of its subsetssupport of its subsets
16
The Apriori Algorithm
Progressively Progressively identifies large identifies large itemsets of itemsets of different sizesdifferent sizes
Exploits the Exploits the property that any property that any subset of a large subset of a large itemset is also a itemset is also a large itemsetlarge itemset Also any superset Also any superset
of a small itemset of a small itemset is also smallis also small
A C DB
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
ldquoldquoIFrdquo part = IFrdquo part = antecedentantecedent
ldquoldquoTHENrdquo part = THENrdquo part = consequentconsequent
ldquoldquoItem setrdquo = the items (eg products) Item setrdquo = the items (eg products) comprising the antecedent or consequentcomprising the antecedent or consequent
Antecedent and consequent are Antecedent and consequent are disjointdisjoint (ie have no items in common)(ie have no items in common)
20
Tiny Example Phone Faceplates
21
Many Rules are Possible
For example Transaction 1 supports For example Transaction 1 supports several rules such as several rules such as ldquoldquoIf red then whiterdquo (ldquoIf a red faceplate If red then whiterdquo (ldquoIf a red faceplate
is purchased then so is a white onerdquo)is purchased then so is a white onerdquo) ldquoldquoIf white then redrdquoIf white then redrdquo ldquoldquoIf red and white then greenrdquoIf red and white then greenrdquo + several more+ several more
22
Frequent Item Sets
Ideally we want to create all possible Ideally we want to create all possible combinations of itemscombinations of items
ProblemProblem computation time grows computation time grows exponentially as items increasesexponentially as items increases
SolutionSolution consider only ldquofrequent item consider only ldquofrequent item setsrdquosetsrdquo
Criterion for frequent Criterion for frequent supportsupport
23
Support
SupportSupport = (or percent) of = (or percent) of transactions that include both the transactions that include both the antecedent and the consequentantecedent and the consequent
Example support for the item set Example support for the item set red white is 4 out of 10 red white is 4 out of 10 transactions or 40transactions or 40
24
Apriori Algorithm
25
Generating Frequent Item Sets
For For kk productshellip productshellip
11 User sets a minimum support criterionUser sets a minimum support criterion
22 Next generate list of one-item sets that Next generate list of one-item sets that meet the support criterionmeet the support criterion
33 Use the list of one-item sets to generate Use the list of one-item sets to generate list of two-item sets that meet the list of two-item sets that meet the support criterionsupport criterion
44 Use list of two-item sets to generate list Use list of two-item sets to generate list of three-item setsof three-item sets
55 Continue up through Continue up through kk-item sets-item sets
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
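As a rough sketch, confidence and lift can be computed directly from counts. The function names and the ten baskets are assumptions (the baskets were chosen to agree with the counts quoted on the slides); the formulas follow the definitions above.

```python
# Assumed faceplate baskets (not reproduced in this transcript).
TRANSACTIONS = [
    {"red", "white", "green"}, {"white", "orange"}, {"white", "blue"},
    {"red", "white", "orange"}, {"red", "blue"}, {"white", "blue"},
    {"red", "blue"}, {"red", "white", "blue", "green"},
    {"red", "white", "blue"}, {"yellow"},
]


def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def confidence(antecedent, consequent, transactions):
    """Share of antecedent transactions that also contain the consequent."""
    both = count(set(antecedent) | set(consequent), transactions)
    return both / count(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by benchmark confidence."""
    benchmark = count(consequent, transactions) / len(transactions)
    return confidence(antecedent, consequent, transactions) / benchmark


print(round(confidence({"green"}, {"red", "white"}, TRANSACTIONS), 3))  # 1.0
print(round(lift({"green"}, {"red", "white"}, TRANSACTIONS), 3))        # 2.5
print(round(lift({"orange"}, {"white"}, TRANSACTIONS), 3))              # 1.429
```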
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
Find frequent item sets (those with sufficient support; see above)
From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50% [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100% [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
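A short sketch of how the six rules from one frequent item set could be enumerated and filtered by a confidence criterion. The helper name rules_from_itemset and the ten baskets are assumptions (the baskets were chosen to agree with the counts quoted on the slides); the order in which the rules come out is not the slide's rule numbering.

```python
from itertools import combinations

# Assumed faceplate baskets (not reproduced in this transcript).
TRANSACTIONS = [
    {"red", "white", "green"}, {"white", "orange"}, {"white", "blue"},
    {"red", "white", "orange"}, {"red", "blue"}, {"white", "blue"},
    {"red", "blue"}, {"red", "white", "blue", "green"},
    {"red", "white", "blue"}, {"yellow"},
]


def count(itemset, transactions):
    return sum(1 for t in transactions if set(itemset) <= t)


def rules_from_itemset(itemset, transactions, min_conf):
    """All rules antecedent => (itemset - antecedent) with confidence >= min_conf."""
    itemset = frozenset(itemset)
    full = count(itemset, transactions)
    rules = []
    for size in range(1, len(itemset)):
        for ante in combinations(sorted(itemset), size):
            ante = frozenset(ante)
            ante_count = count(ante, transactions)
            if ante_count == 0:
                continue  # antecedent never occurs, so no rule to report
            conf = full / ante_count
            if conf >= min_conf:
                rules.append((ante, itemset - ante, conf))
    return rules


for ante, cons, conf in rules_from_itemset({"red", "white", "green"}, TRANSACTIONS, 0.0):
    print(sorted(ante), "=>", sorted(cons), f"{conf:.0%}")

strong = rules_from_itemset({"red", "white", "green"}, TRANSACTIONS, 0.70)
print(len(strong))  # 3
```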
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green =>         Red, White      2           4           2               2.5
2     100      Green =>         Red             2           6           2               1.666667
3     100      Green, White =>  Red             2           6           2               1.666667
4     100      Green =>         White           2           7           2               1.428571
5     100      Green, Red =>    White           2           7           2               1.428571
6     100      Orange =>        White           2           7           2               1.428571
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products):
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action:
    Store layouts
    Which products to put on specials, promote, coupons…
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y → Z, where Y and Z are two products; Y represents the antecedent and Z is called the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y) P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)…
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc…
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well-known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
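The co-occurrence counts can be reproduced from the five POS transactions with a few lines of Python. This is only a sketch; the names POS, ITEMS and cooccurrence are illustrative.

```python
from itertools import product

ITEMS = ["Coke", "Window cleaner", "Milk", "Soda", "Detergent"]

# The five POS transactions from the slide, one set of items per customer.
POS = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]


def cooccurrence(transactions, items):
    """Count, for every ordered pair (a, b), the baskets containing both.

    The diagonal entry (a, a) is simply the number of baskets containing a.
    """
    return {
        (a, b): sum(1 for t in transactions if a in t and b in t)
        for a, b in product(items, repeat=2)
    }


matrix = cooccurrence(POS, ITEMS)
for a in ITEMS:
    print(f"{a:15}", *(f"{matrix[a, b]:2d}" for b in ITEMS))
```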
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
Simple patterns:
1. Coke and soda are purchased together as often as any other two items (2 of 5 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule:
  If a customer purchases soda, then customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
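The two confidence figures can be checked against the POS transactions with a small sketch; the function name confidence and its parameter names are illustrative.

```python
# The five POS transactions from the slide.
POS = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]


def confidence(if_items, then_items, transactions):
    """Transactions with all the items divided by transactions with the "if" items."""
    if_items, then_items = set(if_items), set(then_items)
    with_if = sum(1 for t in transactions if if_items <= t)
    with_all = sum(1 for t in transactions if (if_items | then_items) <= t)
    return with_all / with_if


print(round(confidence({"Soda"}, {"Coke"}, POS), 2))  # 0.67
print(round(confidence({"Coke"}, {"Soda"}, POS), 2))  # 0.5
```

The asymmetry comes only from the denominator: both rules share the same two baskets containing Coke and soda, but soda appears in 3 baskets and Coke in 4.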
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift… When lift > 1, then the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
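A sketch of the lift calculation on the POS transactions, including the negative rule. The slide's own formula is not in this transcript, so the code assumes lift = confidence divided by the overall share of baskets with the result, and it assumes the "Negative Rule" means the same rule with the result negated ("if soda, then NOT Coke"). The function name lift and the negate flag are illustrative.

```python
# The five POS transactions from the earlier slide.
POS = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]


def lift(if_item, then_item, transactions, negate=False):
    """Lift of "if_item => then_item" (or "=> NOT then_item" when negate=True)."""

    def hit(t):
        # True when the basket satisfies the (possibly negated) result.
        return (then_item in t) != negate

    with_if = [t for t in transactions if if_item in t]
    conf = sum(hit(t) for t in with_if) / len(with_if)
    baseline = sum(hit(t) for t in transactions) / len(transactions)
    return conf / baseline


print(round(lift("Soda", "Coke", POS), 3))               # 0.833
print(round(lift("Soda", "Coke", POS, negate=True), 3))  # 1.667
```

Here "if soda then Coke" has 67% confidence, but Coke is in 80% of all baskets, so the lift is below 1; the negated rule has lift above 1.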
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items… "if Coke then soda"
2. Generate co-occurrence matrix for two items… "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items… "if Coke and Milk and Window Cleaner then soda"
4. Etc…
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant… could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket… 10,000 or more unique items
  50 million 2-item combinations
  More than 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
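The combination counts can be checked with Python's standard library. A minimal sketch, assuming exactly 100 menu items and exactly 10,000 supermarket items:

```python
from math import comb

# Number of ways to choose k distinct items out of n, ignoring order.
menu_triples = comb(100, 3)
supermarket_pairs = comb(10_000, 2)
supermarket_triples = comb(10_000, 3)

print(f"{menu_triples:,}")         # 161,700
print(f"{supermarket_pairs:,}")    # 49,995,000
print(f"{supermarket_triples:,}")  # 166,616,670,000
```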
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide a guide to action: to offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut, due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
14
Search Approach
Two sub-problems in discovering all association rules:
Find all sets of items (itemsets) that have transaction support above minimum support
  Itemsets that qualify are called large itemsets, and all others small itemsets
Generate from each large itemset rules that use items from the large itemset
  Given a large itemset Y, and X is a subset of Y
  Take the support of Y and divide it by the support of X
  If the ratio c is at least minconf, then X => (Y - X) is satisfied with confidence factor c
15
Reducing Number of Candidates
Apriori principle: if an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
  Also, any superset of a small itemset is also small
Itemset lattice for items A, B, C, D:
A  B  C  D
AB  AC  AD  BC  BD  CD
ABC  ABD  ACD  BCD
ABCD
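The pruning effect on the four-item lattice can be illustrated with a short sketch. The choice of {A, B} as the small itemset is hypothetical, made only to show which supersets can be skipped without counting them.

```python
from itertools import combinations

ITEMS = "ABCD"

# All 15 non-empty itemsets of the lattice, level by level.
lattice = [frozenset(c) for k in range(1, len(ITEMS) + 1)
           for c in combinations(ITEMS, k)]

# Hypothetical: suppose counting shows that {A, B} is small (below minimum support).
small = frozenset("AB")

# Every superset of a small itemset is also small, so it never needs counting.
pruned = [s for s in lattice if small <= s]
remaining = [s for s in lattice if not small <= s]

print(["".join(sorted(s)) for s in pruned])  # ['AB', 'ABC', 'ABD', 'ABCD']
print(len(remaining))                        # 11
```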
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally we want to create all possible Ideally we want to create all possible combinations of itemscombinations of items
ProblemProblem computation time grows computation time grows exponentially as items increasesexponentially as items increases
SolutionSolution consider only ldquofrequent item consider only ldquofrequent item setsrdquosetsrdquo
Criterion for frequent Criterion for frequent supportsupport
23
Support
SupportSupport = (or percent) of = (or percent) of transactions that include both the transactions that include both the antecedent and the consequentantecedent and the consequent
Example support for the item set Example support for the item set red white is 4 out of 10 red white is 4 out of 10 transactions or 40transactions or 40
24
Apriori Algorithm
25
Generating Frequent Item Sets
For For kk productshellip productshellip
11 User sets a minimum support criterionUser sets a minimum support criterion
22 Next generate list of one-item sets that Next generate list of one-item sets that meet the support criterionmeet the support criterion
33 Use the list of one-item sets to generate Use the list of one-item sets to generate list of two-item sets that meet the list of two-item sets that meet the support criterionsupport criterion
44 Use list of two-item sets to generate list Use list of two-item sets to generate list of three-item setsof three-item sets
55 Continue up through Continue up through kk-item sets-item sets
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
Co-occurrence of Products
                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2

Simple patterns
1. Coke and soda are purchased together more often than most other pairs (2 of 5 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule: "If a customer purchases soda, then the customer also purchases Coke"?
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions
Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda
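The two confidence figures can be checked with a few lines of code. This is a minimal sketch under the definition given on the slide; the function name `confidence` and its signature are assumptions made for illustration.

```python
# The five point-of-sale baskets listed on the slide.
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def confidence(antecedent, consequent, baskets):
    """Share of the baskets containing the 'if' items that also contain the 'then' items."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_if = [b for b in baskets if antecedent <= b]
    if not with_if:
        raise ValueError("antecedent never occurs in the data")
    with_both = [b for b in with_if if consequent <= b]
    return len(with_both) / len(with_if)

soda_then_coke = confidence({"soda"}, {"Coke"}, baskets)  # 2 of the 3 soda baskets
coke_then_soda = confidence({"Coke"}, {"soda"}, baskets)  # 2 of the 4 Coke baskets
print(f"soda => Coke: {soda_then_coke:.0%}, Coke => soda: {coke_then_soda:.0%}")
```

The asymmetry comes only from the denominator: both rules share the same two baskets in the numerator.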
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift...
  When lift > 1, the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
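As an illustration, lift can be computed for the soda/Coke baskets used earlier in this deck. This is a sketch, not the deck's own calculation; `support`, `lift` and `negative_lift` are hypothetical helper names, and the negative rule is read here as "if antecedent, then NOT consequent".

```python
# The five point-of-sale baskets from the Coke example.
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for b in baskets if itemset <= b) / len(baskets)

def lift(antecedent, consequent, baskets):
    """Observed co-occurrence divided by the co-occurrence expected under independence."""
    both = support(set(antecedent) | set(consequent), baskets)
    return both / (support(antecedent, baskets) * support(consequent, baskets))

def negative_lift(antecedent, consequent, baskets):
    """Lift of the negative rule 'if antecedent then NOT consequent'."""
    a, c = set(antecedent), set(consequent)
    n = len(baskets)
    p_a = sum(1 for b in baskets if a <= b) / n
    p_not_c = sum(1 for b in baskets if not c <= b) / n
    p_a_not_c = sum(1 for b in baskets if a <= b and not c <= b) / n
    return p_a_not_c / (p_a * p_not_c)

soda_coke = lift({"soda"}, {"Coke"}, baskets)               # (2/5) / ((3/5) * (4/5))
soda_not_coke = negative_lift({"soda"}, {"Coke"}, baskets)  # (1/5) / ((3/5) * (1/5))
print(f"soda => Coke lift: {soda_coke:.2f}; soda => not Coke lift: {soda_not_coke:.2f}")
```

In this tiny data set the rule "soda => Coke" has lift below 1 because Coke is already in 4 of 5 baskets, so the negative rule is the more informative one.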
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
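The combination counts above follow from the binomial coefficient "n choose k". A short check, assuming exactly 100 menu items and exactly 10,000 supermarket items (the slide says "10,000 or more"):

```python
from math import comb

# Number of distinct k-item combinations drawn from n unique items.
menu_triples = comb(100, 3)         # fast food menu, 3 different items
market_pairs = comb(10_000, 2)      # supermarket, 2-item combinations
market_triples = comb(10_000, 3)    # supermarket, 3-item combinations

print(f"{menu_triples:,} {market_pairs:,} {market_triples:,}")
```

The jump from pairs to triples is roughly a factor of n/3, which is why grouping items into product hierarchies (a smaller n) reduces the work so sharply.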
52
Business and other cases
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or
  activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
15
Reducing Number of Candidates
Apriori principle: If an itemset is large, then all of its subsets must also be large
Support of an itemset never exceeds the support of its subsets
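Stated as a formula, with supp(X) denoting the number of transactions that contain every item of X (the notation is an assumption, not the deck's), the principle reads as follows. It holds because every transaction that contains the larger set Y necessarily contains its subset X.

```latex
X \subseteq Y \;\Longrightarrow\; \operatorname{supp}(X) \ge \operatorname{supp}(Y)
```

Read in the other direction, if X fails the minimum support threshold then so does every superset of X, which is what allows candidates to be discarded without counting them.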
16
The Apriori Algorithm
Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
  Also, any superset of a small itemset is also small
Itemset lattice:
  A   B   C   D
  AB   AC   AD   BC   BD   CD
  ABC   ABD   ACD   BCD
  ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = number (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
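A support count is a single pass over the transactions. The faceplate table itself is not reproduced in this transcript, so the ten transactions below are an assumption: hypothetical baskets chosen only so that the counts agree with the figures quoted on the slides (for example, 4 of 10 for {red, white}). The helper name `support_count` is also illustrative.

```python
# Hypothetical transactions (NOT taken from the transcript); chosen to be
# consistent with the support counts quoted on the slides.
transactions = [
    {"red", "white", "green"},
    {"white", "orange"},
    {"white", "blue"},
    {"red", "white", "orange"},
    {"red", "blue"},
    {"white", "blue"},
    {"red", "blue"},
    {"red", "white", "blue", "green"},
    {"red", "white", "blue"},
    {"yellow"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

count = support_count({"red", "white"}, transactions)
share = count / len(transactions)
print(f"support(red, white) = {count} of {len(transactions)} = {share:.0%}")
```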
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
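The steps above can be sketched as a level-wise loop. This is a simplified illustration of the Apriori idea, not a tuned implementation; the function name `apriori`, the `min_count` parameter, and the ten transactions (hypothetical data chosen to agree with the counts quoted on the slides) are all assumptions.

```python
from itertools import combinations

# Hypothetical transactions (NOT taken from the transcript).
transactions = [
    {"red", "white", "green"},
    {"white", "orange"},
    {"white", "blue"},
    {"red", "white", "orange"},
    {"red", "blue"},
    {"white", "blue"},
    {"red", "blue"},
    {"red", "white", "blue", "green"},
    {"red", "white", "blue"},
    {"yellow"},
]

def apriori(transactions, min_count):
    """Return {frozenset(item set): support count} for every frequent item set."""
    def count(candidate):
        return sum(1 for t in transactions if candidate <= t)

    # Step 2: one-item sets that meet the support criterion.
    items = sorted({i for t in transactions for i in t})
    current = {frozenset([i]) for i in items if count(frozenset([i])) >= min_count}
    frequent = {s: count(s) for s in current}

    k = 2
    while current:
        # Steps 3-5: join frequent (k-1)-item sets into k-item candidates ...
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # ... and prune any candidate that has an infrequent (k-1)-item subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if count(c) >= min_count}
        frequent.update({c: count(c) for c in current})
        k += 1
    return frequent

# Step 1: the user-chosen minimum support, here 2 of 10 transactions.
frequent = apriori(transactions, min_count=2)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The pruning step is where the Apriori principle pays off: a four-item candidate such as {red, white, blue, green} is dropped without being counted because one of its three-item subsets is not frequent.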
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
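As a worked example, take the rule Green => {Red, White} from the faceplate data in this deck: both transactions containing green also contain red and white (confidence 2/2), and {red, white} appears in 4 of the 10 transactions (benchmark confidence 4/10). Under those figures:

```latex
\text{lift} = \frac{\text{confidence}}{\text{benchmark confidence}}
            = \frac{2/2}{4/10} = \frac{1.00}{0.40} = 2.5
```

A lift of 2.5 says that a transaction known to contain green is 2.5 times as likely to contain {red, white} as a transaction picked at random.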
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support - see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
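The six rules and their confidences can be enumerated from support counts alone. The sketch below uses only the counts quoted on the slides (out of 10 transactions); the function name `rules_from` and its return shape are assumptions made for illustration.

```python
from itertools import combinations

# Support counts (number of transactions out of 10) as quoted on the slides.
support = {
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red", "white", "green"}): 2,
}

def rules_from(itemset, support, min_confidence):
    """List (antecedent, consequent, confidence) for every split of itemset."""
    itemset = frozenset(itemset)
    kept = []
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            antecedent = frozenset(antecedent)
            # confidence = support(whole item set) / support(antecedent)
            conf = support[itemset] / support[antecedent]
            if conf >= min_confidence:
                kept.append((sorted(antecedent), sorted(itemset - antecedent), conf))
    return kept

all_rules = rules_from({"red", "white", "green"}, support, min_confidence=0.0)
strong_rules = rules_from({"red", "white", "green"}, support, min_confidence=0.7)
for antecedent, consequent, conf in all_rules:
    print(antecedent, "=>", consequent, f"{conf:.0%}")
```

With the 70% criterion, three rules survive, all at 100% confidence and all with green in the antecedent. The order in which this sketch lists the rules need not match the rule numbering on the slide.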
30
All Rules (XLMiner Output)
Rule   Conf. %   Antecedent (a)   Consequent (c)   Support(a)   Support(c)   Support(a U c)   Lift Ratio
1      100       Green            Red, White       2            4            2                2.5
2      100       Green            Red              2            6            2                1.666667
3      100       Green, White     Red              2            6            2                1.666667
4      100       Green            White            2            7            2                1.428571
5      100       Green, Red       White            2            7            2                1.428571
6      100       Orange           White            2            7            2                1.428571
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
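A small simulation makes the point concrete. In the sketch below every item is placed in every basket independently, so no real association exists, yet some item pairs still show lift above 1. All sizes, the purchase probability and the seed are arbitrary assumptions.

```python
import random
from itertools import combinations

random.seed(0)  # arbitrary seed so the run is repeatable

# Hypothetical sizes: 20 items, 50 baskets, each item bought with probability 0.2.
n_items, n_transactions, p = 20, 50, 0.2
transactions = [
    {i for i in range(n_items) if random.random() < p}
    for _ in range(n_transactions)
]

def support(itemset):
    """Fraction of the simulated baskets containing every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / n_transactions

# Lift for every pair of items; the items are independent by construction,
# so any lift well above 1 here is produced by chance alone.
lifts = {}
for a, b in combinations(range(n_items), 2):
    s_a, s_b = support({a}), support({b})
    if s_a > 0 and s_b > 0:
        lifts[(a, b)] = support({a, b}) / (s_a * s_b)

best_pair, best_lift = max(lifts.items(), key=lambda kv: kv[1])
print(f"items {best_pair}: lift {best_lift:.2f} from purely random data")
```

Raising `n_transactions` while keeping everything else fixed pulls the lifts toward 1, in line with the remark that rules based on many records are less exposed to this danger.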
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being the most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail - each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products; Y represents the antecedent and Z is called the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y)P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail - supermarkets, etc....
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
16
The Apriori Algorithm
Progressively Progressively identifies large identifies large itemsets of itemsets of different sizesdifferent sizes
Exploits the Exploits the property that any property that any subset of a large subset of a large itemset is also a itemset is also a large itemsetlarge itemset Also any superset Also any superset
of a small itemset of a small itemset is also smallis also small
A C DB
AB AC AD BC BD CD
ABC ABD ACD BCD
ABCD
17
Used in many recommender systems
18
Generating Rules
19
Terms
ldquoldquoIFrdquo part = IFrdquo part = antecedentantecedent
ldquoldquoTHENrdquo part = THENrdquo part = consequentconsequent
ldquoldquoItem setrdquo = the items (eg products) Item setrdquo = the items (eg products) comprising the antecedent or consequentcomprising the antecedent or consequent
Antecedent and consequent are Antecedent and consequent are disjointdisjoint (ie have no items in common)(ie have no items in common)
20
Tiny Example Phone Faceplates
21
Many Rules are Possible
For example Transaction 1 supports For example Transaction 1 supports several rules such as several rules such as ldquoldquoIf red then whiterdquo (ldquoIf a red faceplate If red then whiterdquo (ldquoIf a red faceplate
is purchased then so is a white onerdquo)is purchased then so is a white onerdquo) ldquoldquoIf white then redrdquoIf white then redrdquo ldquoldquoIf red and white then greenrdquoIf red and white then greenrdquo + several more+ several more
22
Frequent Item Sets
Ideally we want to create all possible Ideally we want to create all possible combinations of itemscombinations of items
ProblemProblem computation time grows computation time grows exponentially as items increasesexponentially as items increases
SolutionSolution consider only ldquofrequent item consider only ldquofrequent item setsrdquosetsrdquo
Criterion for frequent Criterion for frequent supportsupport
23
Support
SupportSupport = (or percent) of = (or percent) of transactions that include both the transactions that include both the antecedent and the consequentantecedent and the consequent
Example support for the item set Example support for the item set red white is 4 out of 10 red white is 4 out of 10 transactions or 40transactions or 40
24
Apriori Algorithm
25
Generating Frequent Item Sets
For For kk productshellip productshellip
11 User sets a minimum support criterionUser sets a minimum support criterion
22 Next generate list of one-item sets that Next generate list of one-item sets that meet the support criterionmeet the support criterion
33 Use the list of one-item sets to generate Use the list of one-item sets to generate list of two-item sets that meet the list of two-item sets that meet the support criterionsupport criterion
44 Use list of two-item sets to generate list Use list of two-item sets to generate list of three-item setsof three-item sets
55 Continue up through Continue up through kk-item sets-item sets
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more, with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
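The six confidences above can be reproduced with a short sketch. The support counts are the ones quoted on this slide and in the XLMiner output that follows; the function name rules_from and the order in which rules are listed are assumptions and do not match the slide's rule numbering.

```python
from itertools import combinations

# Support counts (number of transactions) quoted on the slides for the faceplate example.
SUPPORT = {
    frozenset({"red", "white", "green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
}

def rules_from(item_set, support):
    """Yield (antecedent, consequent, confidence) for every split of item_set."""
    item_set = frozenset(item_set)
    for size in range(len(item_set) - 1, 0, -1):
        for antecedent in combinations(sorted(item_set), size):
            antecedent = frozenset(antecedent)
            consequent = item_set - antecedent
            # confidence = support(whole item set) / support(antecedent)
            yield antecedent, consequent, support[item_set] / support[antecedent]

MIN_CONFIDENCE = 0.70
rules = list(rules_from({"red", "white", "green"}, SUPPORT))
for a, c, conf in rules:
    flag = "report" if conf >= MIN_CONFIDENCE else "drop"
    print(f"{sorted(a)} => {sorted(c)}: confidence {conf:.0%} ({flag})")
```

Three of the six rules clear the 70% criterion, matching the slide.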
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green =>         Red, White      2           4           2               2.5
2     100      Green =>         Red             2           6           2               1.666667
3     100      Green, White =>  Red             2           6           2               1.666667
4     100      Green =>         White           2           7           2               1.428571
5     100      Green, Red =>    White           2           7           2               1.428571
6     100      Orange =>        White           2           7           2               1.428571
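As a quick check, the Lift Ratio column can be recomputed from the three support columns using lift = confidence / benchmark confidence. The sketch below assumes the faceplate data set has 10 transactions (the support slide's "4 out of 10"); the names lift_ratio and XLMINER_ROWS are illustrative.

```python
N_TRANSACTIONS = 10  # assumption: taken from the support example, "4 out of 10 transactions"

# (antecedent, consequent, support(a), support(c), support(a U c), lift reported by XLMiner)
XLMINER_ROWS = [
    ("Green", "Red, White", 2, 4, 2, 2.5),
    ("Green", "Red", 2, 6, 2, 1.666667),
    ("Green, White", "Red", 2, 6, 2, 1.666667),
    ("Green", "White", 2, 7, 2, 1.428571),
    ("Green, Red", "White", 2, 7, 2, 1.428571),
    ("Orange", "White", 2, 7, 2, 1.428571),
]

def lift_ratio(support_a, support_c, support_ac, n):
    """Lift = confidence / benchmark confidence, all from raw transaction counts."""
    confidence = support_ac / support_a   # share of antecedent transactions with the consequent
    benchmark = support_c / n             # share of all transactions with the consequent
    return confidence / benchmark

for a, c, s_a, s_c, s_ac, reported in XLMINER_ROWS:
    computed = lift_ratio(s_a, s_c, s_ac, N_TRANSACTIONS)
    print(f"{a} => {c}: lift {computed:.6f} (reported {reported})")
```

All six rules have 100% confidence, so the lift differences come entirely from how common the consequent is.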
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
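A small simulation can make this caution concrete. In the sketch below every item is placed in a basket independently, so no real association exists and any lift above 1 is chance alone. The item count, inclusion probability, record counts and seed are arbitrary assumptions, and max_pair_lift is a hypothetical helper.

```python
import random
from itertools import combinations

def max_pair_lift(n_records, n_items=10, p=0.3, seed=0):
    """Largest pairwise lift found in purely random baskets (items chosen independently)."""
    rng = random.Random(seed)
    baskets = [
        {i for i in range(n_items) if rng.random() < p}
        for _ in range(n_records)
    ]

    def count(items):
        return sum(1 for b in baskets if items <= b)

    best = 0.0
    for a, b in combinations(range(n_items), 2):
        n_a, n_b, n_ab = count({a}), count({b}), count({a, b})
        if n_a and n_b:
            # lift = P(a and b) / (P(a) * P(b))
            lift = (n_ab / n_records) / ((n_a / n_records) * (n_b / n_records))
            best = max(best, lift)
    return best

small = max_pair_lift(n_records=20)
large = max_pair_lift(n_records=5000)
print(f"max lift with 20 random records:   {small:.2f}")
print(f"max lift with 5000 random records: {large:.2f}")
```

With only 20 records, some pair will typically look well above 1 by chance. With thousands of records the largest lift should settle close to 1.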
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e., P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y)P(Z)
  interest = P(Y ∧ Z) / (P(Y)P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
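Written out side by side, the three measures show that interest is the same quantity as the lift ratio from the Measures of Performance slide: confidence divided by the benchmark P(Z). The worked numbers use the rule "if soda then Coke" from the five-basket example later in these slides (soda in 3 baskets, Coke in 4, both in 2).

```latex
\begin{align*}
\text{support}(Y \Rightarrow Z) &= P(Y \wedge Z) \\
\text{confidence}(Y \Rightarrow Z) &= \frac{P(Y \wedge Z)}{P(Y)} = P(Z \mid Y) \\
\text{interest}(Y \Rightarrow Z) &= \frac{P(Y \wedge Z)}{P(Y)\,P(Z)}
  = \frac{\text{confidence}(Y \Rightarrow Z)}{P(Z)} \\[6pt]
\text{soda} \Rightarrow \text{Coke}:\quad
\text{support} &= \tfrac{2}{5}, \qquad
\text{confidence} = \frac{2/5}{3/5} = \tfrac{2}{3}, \qquad
\text{interest} = \frac{2/5}{(3/5)(4/5)} = \tfrac{5}{6} \approx 0.83
\end{align*}
```

An interest below one, as here, means the two products appear together less often than independence would predict.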
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)...
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc...
Visualization is extremely helpful
Transaction Data
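A few of the questions on this slide can be sketched against a flat list of order lines. The records below are hypothetical (the slide's schema diagram is not reproduced in this transcript), the tuple layout (customer, order, item, quantity) is an assumption, and "items/order" is read here as units per order.

```python
from collections import defaultdict

# Hypothetical order lines: (customer_id, order_id, item, quantity).
ORDER_LINES = [
    ("c1", "o1", "Coke", 2),
    ("c1", "o1", "soda", 1),
    ("c1", "o2", "Coke", 1),
    ("c2", "o3", "milk", 1),
    ("c2", "o3", "Coke", 3),
    ("c3", "o4", "detergent", 1),
]

orders_by_customer = defaultdict(set)
items_by_order = defaultdict(set)
units_by_order = defaultdict(int)
for customer, order, item, qty in ORDER_LINES:
    orders_by_customer[customer].add(order)
    items_by_order[order].add(item)
    units_by_order[order] += qty

n_customers = len(orders_by_customer)
n_orders = len(items_by_order)

avg_orders_per_customer = n_orders / n_customers
avg_unique_items_per_order = sum(len(s) for s in items_by_order.values()) / n_orders
avg_units_per_order = sum(units_by_order.values()) / n_orders

def pct_customers_who_bought(item):
    """Percentage of all customers with at least one order line for the item."""
    buyers = {c for c, _, i, _ in ORDER_LINES if i == item}
    return 100 * len(buyers) / n_customers

print(avg_orders_per_customer, avg_unique_items_per_order, avg_units_per_order)
print(round(pct_customers_who_bought("Coke"), 1))
```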
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as Shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules - contain high-quality, actionable information
  Trivial Rules - information already well-known by those familiar with the business
  Inexplicable Rules - no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
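The co-occurrence table can be derived directly from the five transactions. The sketch below is one simple way to do it; the function name co_occurrence and the nested-dictionary layout are assumptions made for illustration.

```python
ITEMS = ["Coke", "Window cleaner", "Milk", "Soda", "Detergent"]

# The five POS transactions from the slide.
BASKETS = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]

def co_occurrence(baskets, items):
    """matrix[a][b] = number of baskets containing both a and b (diagonal = count of a)."""
    matrix = {a: {b: 0 for b in items} for a in items}
    for basket in baskets:
        for a in basket:
            for b in basket:
                matrix[a][b] += 1
    return matrix

matrix = co_occurrence(BASKETS, ITEMS)
for a in ITEMS:
    print(f"{a:<15}", [matrix[a][b] for b in ITEMS])
```

The matrix is symmetric, and each diagonal entry is the number of baskets containing that item.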
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
Simple patterns
1. Coke and soda are purchased together at least as often as any other two items (2 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule:
  If a customer purchases soda, then customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
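The two confidence figures above follow directly from the definition. A minimal sketch, with the function name confidence chosen for illustration:

```python
# The five POS transactions from the slide.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def confidence(baskets, if_items, then_items):
    """Transactions with all the items divided by transactions with just the 'if' items."""
    if_items, then_items = set(if_items), set(then_items)
    with_if = [b for b in baskets if if_items <= b]
    with_all = [b for b in with_if if then_items <= b]
    return len(with_all) / len(with_if)

soda_then_coke = confidence(BASKETS, {"soda"}, {"Coke"})
coke_then_soda = confidence(BASKETS, {"Coke"}, {"soda"})
print(f"if soda then Coke: {soda_then_coke:.0%}")
print(f"if Coke then soda: {coke_then_soda:.0%}")
```

Reversing a rule changes only the denominator, which is why the two directions give different confidences.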
48
How Good is an Association Rule
How much better than chance is a rule?
  Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift... When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
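As a worked sketch on the five POS baskets from the earlier slides, the rule "if soda then Coke" has lift below 1, while its negative rule "if soda then NOT Coke" has lift above 1. The variable names are illustrative. Lift is computed here as confidence divided by the overall rate of the result, which equals the observed count divided by the count expected under no relationship (2 observed against 5 x 3/5 x 4/5 = 2.4 expected).

```python
# The five POS transactions from the earlier slides.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

n = len(BASKETS)
with_soda = [b for b in BASKETS if "soda" in b]

# Rule: if soda then Coke
p_coke = sum("Coke" in b for b in BASKETS) / n                      # 4 of 5 baskets
conf_rule = sum("Coke" in b for b in with_soda) / len(with_soda)    # 2 of 3 soda baskets
lift_rule = conf_rule / p_coke

# Negative rule: if soda then NOT Coke
p_not_coke = 1 - p_coke
conf_negative = sum("Coke" not in b for b in with_soda) / len(with_soda)
lift_negative = conf_negative / p_not_coke

print(f"if soda then Coke:     lift = {lift_rule:.2f}")
print(f"if soda then NOT Coke: lift = {lift_negative:.2f}")
```

Coke is in 80% of all baskets, so a 67% confidence is actually worse than simply guessing Coke every time.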
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  50 million 2-item combinations
  Over 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time-period could also be huge (hence expensive to analyze)
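The combination counts on this slide are "n choose k" values and can be checked with Python's standard library. The slide's figures are rounded; for exactly 10,000 items the 3-item count comes out nearer 167 billion than 100 billion.

```python
from math import comb

menu_triples = comb(100, 3)            # 3-item combinations from a 100-item menu
supermarket_pairs = comb(10_000, 2)    # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)  # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```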
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide guide to action to offer product or service (cross-sell)
61
In retailing case of items purchased together, guidance is not so clear cut due to extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
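One simple form of aggregation is rolling items up a product hierarchy before counting. The sketch below reuses the five POS baskets from the earlier slides. The item-to-category mapping is entirely hypothetical, and roll_up is an illustrative helper, not a technique the slides specify.

```python
# Hypothetical product hierarchy: item -> category.
CATEGORY = {
    "Coke": "beverages",
    "soda": "beverages",
    "milk": "dairy",
    "detergent": "cleaning",
    "window cleaner": "cleaning",
}

# The five POS transactions from the earlier slides.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def roll_up(baskets, hierarchy):
    """Replace each item by its category, so each basket becomes a set of categories."""
    return [{hierarchy[item] for item in basket} for basket in baskets]

category_baskets = roll_up(BASKETS, CATEGORY)
distinct_items = len(set().union(*BASKETS))
distinct_categories = len(set().union(*category_baskets))
both = sum(1 for b in category_baskets if {"beverages", "cleaning"} <= b)

print(distinct_items, "items ->", distinct_categories, "categories")
print("baskets with beverages and cleaning:", both)
```

Fewer distinct symbols means far fewer candidate combinations, at the cost of losing item-level detail.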
64
Thank You
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  About 50 million 2-item combinations
  Over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
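The counts on this slide are binomial coefficients ("n choose k"). A minimal check, assuming exactly 100 menu items and exactly 10,000 supermarket items (the slide says "10,000 or more"); the variable names are made up for the example:

```python
from math import comb

# Number of distinct item combinations, ignoring order.
menu_triples = comb(100, 3)            # 3 different items from a 100-item menu
supermarket_pairs = comb(10_000, 2)    # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)  # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")         # 161,700
print(f"{supermarket_pairs:,}")    # 49,995,000
print(f"{supermarket_triples:,}")  # 166,616,670,000
```

The three-item count is closer to 167 billion than 100 billion, which only strengthens the slide's point.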
52
Business and other cases
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or
  activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
18
Generating Rules
19
Terms
"IF" part = antecedent
"THEN" part = consequent
"Item set" = the items (e.g., products) comprising the antecedent or consequent
Antecedent and consequent are disjoint (i.e., have no items in common)
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
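The steps above can be sketched in a few lines. This is a simplified, unoptimized illustration, not XLMiner's implementation. The faceplate transactions are not reproduced in this transcript, so the example runs on the five Coke/soda baskets from elsewhere in the deck. The function name and the minimum count of 2 baskets are assumptions made for the example.

```python
from itertools import combinations

def frequent_item_sets(transactions, min_count):
    """Level-wise search: return {item set: count} for sets found in >= min_count baskets."""
    rows = [frozenset(t) for t in transactions]

    def count(item_set):
        return sum(item_set <= row for row in rows)

    # Steps 1-2: one-item sets that meet the support criterion.
    singles = {frozenset([item]) for row in rows for item in row}
    level = {s: count(s) for s in singles if count(s) >= min_count}

    frequent = {}
    size = 1
    while level:
        frequent.update(level)
        size += 1
        # Steps 3-5: join frequent sets of the previous size into larger candidates ...
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # ... and drop any candidate that has an infrequent subset one item smaller.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, size - 1))
        }
        level = {c: count(c) for c in candidates if count(c) >= min_count}
    return frequent

baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
result = frequent_item_sets(baskets, min_count=2)
for item_set, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(item_set), n)
```

With a minimum of 2 baskets, milk drops out at the one-item stage, so no pair containing milk is ever counted. That pruning is what keeps the search manageable as the number of items grows.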
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support; see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
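A small sketch of how the six rules and their confidences can be derived from one frequent item set. The support counts are the ones given on this slide and in the XLMiner output. The function name and the tuple layout are made up for the example.

```python
from itertools import combinations

# Support counts (number of transactions out of 10) taken from the deck.
support_count = {
    frozenset({"red", "white", "green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
}

def rules_from_item_set(item_set, counts, min_confidence):
    """Split item_set into every antecedent/consequent pair and keep confident rules."""
    item_set = frozenset(item_set)
    rules = []
    for size in range(1, len(item_set)):
        for antecedent in combinations(sorted(item_set), size):
            antecedent = frozenset(antecedent)
            confidence = counts[item_set] / counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, item_set - antecedent, confidence))
    return rules

all_rules = rules_from_item_set({"red", "white", "green"}, support_count, 0.0)
strong_rules = rules_from_item_set({"red", "white", "green"}, support_count, 0.70)

for antecedent, consequent, confidence in all_rules:
    print(sorted(antecedent), "=>", sorted(consequent), f"{confidence:.0%}")
```

The rules are not listed in the slide's order, but the three that survive the 70% cut are the ones the slide calls rules 2, 3 and 6.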
30
All Rules (XLMiner Output)
Rule #  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1       100      Green =>         Red, White      2           4           2               2.5
2       100      Green =>         Red             2           6           2               1.666667
3       100      Green, White =>  Red             2           6           2               1.666667
4       100      Green =>         White           2           7           2               1.428571
5       100      Green, Red =>    White           2           7           2               1.428571
6       100      Orange =>        White           2           7           2               1.428571
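The Lift Ratio column can be recomputed from the other columns using lift = confidence / benchmark confidence. The sketch below assumes the data set has 10 transactions, as stated on the Support slide. The helper name and the tuple layout are made up for the example.

```python
N_TRANSACTIONS = 10  # assumption taken from the Support slide ("4 out of 10")

# (antecedent, consequent, support(a), support(c), support(a U c)), all as counts.
rows = [
    ("Green", "Red, White", 2, 4, 2),
    ("Green", "Red", 2, 6, 2),
    ("Green, White", "Red", 2, 6, 2),
    ("Green", "White", 2, 7, 2),
    ("Green, Red", "White", 2, 7, 2),
    ("Orange", "White", 2, 7, 2),
]

def lift_ratio(support_a, support_c, support_both, n):
    confidence = support_both / support_a  # share of antecedent baskets with the consequent
    benchmark = support_c / n              # share of all baskets with the consequent
    return confidence / benchmark

lifts = [lift_ratio(a, c, both, N_TRANSACTIONS) for _, _, a, c, both in rows]
for (antecedent, consequent, *_), value in zip(rows, lifts):
    print(f"{antecedent} => {consequent}: lift = {value:.6f}")
```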
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, different quantities, at different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e., P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence (P(Y ∧ Z)) to the expected frequency of co-occurrence under the assumption of independence of Y and Z (P(Y) * P(Z))
  interest = P(Y ∧ Z) / (P(Y) * P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
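The interest measure is the same quantity that the Measures of Performance slide calls lift. A short derivation, using only the definitions on this slide:

```latex
\begin{align*}
\text{interest}(Y \Rightarrow Z)
  &= \frac{P(Y \wedge Z)}{P(Y)\,P(Z)} \\
  &= \frac{P(Y \wedge Z)\,/\,P(Y)}{P(Z)} \\
  &= \frac{\text{confidence}(Y \Rightarrow Z)}{P(Z)}
\end{align*}
```

Here P(Z) is the share of all baskets that contain the consequent, which is the benchmark confidence, so interest and lift coincide. An interest of exactly one means Y and Z co-occur as often as independence would predict.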
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)...
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well-known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
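A minimal sketch of how the co-occurrence table can be derived from the five POS transactions. Each cell counts the baskets containing both the row item and the column item, so the diagonal is simply each item's own basket count. The variable names are made up for the example.

```python
# The five POS baskets from the table above.
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]

# matrix[i][j] = number of baskets that contain both items[i] and items[j].
matrix = [
    [sum(row_item in b and col_item in b for b in baskets) for col_item in items]
    for row_item in items
]

for row_item, row in zip(items, matrix):
    print(f"{row_item:<15}", *row)
```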
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns
1. Coke and soda are purchased together as often as any other two items (2 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule:
  If a customer purchases soda, then customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
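The two confidence figures on this slide can be reproduced directly from the five baskets. This is a minimal sketch, and the function name is made up for the example.

```python
# The five POS baskets from the table above.
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def confidence(antecedent, consequent, transactions):
    """Share of the baskets containing the "if" items that also contain the "then" items."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_if = [t for t in transactions if antecedent <= t]
    if not with_if:
        raise ValueError("antecedent never occurs in the data")
    with_all = [t for t in with_if if consequent <= t]
    return len(with_all) / len(with_if)

soda_to_coke = confidence({"soda"}, {"Coke"}, baskets)  # 2 of 3 soda baskets
coke_to_soda = confidence({"Coke"}, {"soda"}, baskets)  # 2 of 4 Coke baskets
print(f"soda => Coke: {soda_to_coke:.0%}")
print(f"Coke => soda: {coke_to_soda:.0%}")
```

Confidence is not symmetric: reversing the rule changes the denominator, which is why the two directions give different values.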
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association RulesThe Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Finally know that the number of Finally know that the number of transactions in a given time-period could transactions in a given time-period could also be huge (hence expensive to analyze)also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items In retailing case of items purchased together guidance is purchased together guidance is not so clear cut due to extensive not so clear cut due to extensive number of rulesnumber of rules
62
Challenges
A major difficulty is that a large number of A major difficulty is that a large number of the rules found may be trivial for anyone the rules found may be trivial for anyone familiar with the business familiar with the business
The computational complexity involved in The computational complexity involved in calculating the results of market basket calculating the results of market basket analysis is at least the square of the number analysis is at least the square of the number of transaction item-lines (records of every of transaction item-lines (records of every item purchased) With data warehouses item purchased) With data warehouses storing billions of transaction lines this storing billions of transaction lines this yields extremely high computational yields extremely high computational requirements requirements
63
Solutions
Differential market basket analysisDifferential market basket analysis can find interesting results and can also can find interesting results and can also eliminate the problem of a potentially eliminate the problem of a potentially high volume of trivial resultshigh volume of trivial results
Special techniques involving Special techniques involving filtering filtering or aggregationor aggregation of the transaction of the transaction database are commonly used to in database are commonly used to in analysis algorithms to increase analysis algorithms to increase performance and allow some level of performance and allow some level of interactivity such as in business interactivity such as in business intelligence applicationsintelligence applications
64
Thank You
19
Terms
ldquoldquoIFrdquo part = IFrdquo part = antecedentantecedent
ldquoldquoTHENrdquo part = THENrdquo part = consequentconsequent
ldquoldquoItem setrdquo = the items (eg products) Item setrdquo = the items (eg products) comprising the antecedent or consequentcomprising the antecedent or consequent
Antecedent and consequent are Antecedent and consequent are disjointdisjoint (ie have no items in common)(ie have no items in common)
20
Tiny Example Phone Faceplates
21
Many Rules are Possible
For example Transaction 1 supports For example Transaction 1 supports several rules such as several rules such as ldquoldquoIf red then whiterdquo (ldquoIf a red faceplate If red then whiterdquo (ldquoIf a red faceplate
is purchased then so is a white onerdquo)is purchased then so is a white onerdquo) ldquoldquoIf white then redrdquoIf white then redrdquo ldquoldquoIf red and white then greenrdquoIf red and white then greenrdquo + several more+ several more
22
Frequent Item Sets
Ideally we want to create all possible Ideally we want to create all possible combinations of itemscombinations of items
ProblemProblem computation time grows computation time grows exponentially as items increasesexponentially as items increases
SolutionSolution consider only ldquofrequent item consider only ldquofrequent item setsrdquosetsrdquo
Criterion for frequent Criterion for frequent supportsupport
23
Support
SupportSupport = (or percent) of = (or percent) of transactions that include both the transactions that include both the antecedent and the consequentantecedent and the consequent
Example support for the item set Example support for the item set red white is 4 out of 10 red white is 4 out of 10 transactions or 40transactions or 40
24
Apriori Algorithm
25
Generating Frequent Item Sets
For For kk productshellip productshellip
11 User sets a minimum support criterionUser sets a minimum support criterion
22 Next generate list of one-item sets that Next generate list of one-item sets that meet the support criterionmeet the support criterion
33 Use the list of one-item sets to generate Use the list of one-item sets to generate list of two-item sets that meet the list of two-item sets that meet the support criterionsupport criterion
44 Use list of two-item sets to generate list Use list of two-item sets to generate list of three-item setsof three-item sets
55 Continue up through Continue up through kk-item sets-item sets
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both products Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y) P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
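As a minimal sketch of the three measures, the Python below computes support, confidence and interest for a rule Y => Z over a list of baskets. The function name `rule_measures` is an illustrative assumption, and the five toy baskets mirror the Coke/soda point-of-sale example used elsewhere in this deck; neither is part of this slide.

```python
def rule_measures(baskets, y, z):
    """Return (support, confidence, interest) for the rule y => z.

    baskets: list of sets of item names; y and z: single item names.
    Assumes y and z each occur in at least one basket.
    """
    n = len(baskets)
    p_y = sum(1 for b in baskets if y in b) / n
    p_z = sum(1 for b in baskets if z in b) / n
    p_yz = sum(1 for b in baskets if y in b and z in b) / n

    support = p_yz                  # P(Y and Z)
    confidence = p_yz / p_y         # P(Z | Y)
    interest = p_yz / (p_y * p_z)   # observed vs. expected under independence
    return support, confidence, interest


# Toy baskets (assumed for illustration).
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

support, confidence, interest = rule_measures(baskets, "detergent", "Coke")
print(f"support={support:.2f} confidence={confidence:.2f} interest={interest:.2f}")
```

For detergent => Coke the interest works out to 0.4 / (0.4 x 0.8) = 1.25, so this rule would pass an "interest of at least one" filter.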
38
Association Rules Apply Elsewhere
Besides retail - supermarkets, etc.…
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc.…
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules - contain high-quality, actionable information
  Trivial Rules - information already well-known by those familiar with the business
  Inexplicable Rules - no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
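The co-occurrence counts can be derived mechanically from the five POS transactions. The sketch below is one way to do it in Python; the variable names (`transactions`, `items`, `pair_counts`, `matrix`) are illustrative choices, not something the slides prescribe. Diagonal cells count baskets containing the item, off-diagonal cells count baskets containing both items.

```python
from collections import Counter
from itertools import combinations

# The five POS transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

# Row/column order used on the slide.
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]

pair_counts = Counter()
for basket in transactions:
    for item in basket:
        pair_counts[(item, item)] += 1          # diagonal: item frequency
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1                # symmetric off-diagonal cells
        pair_counts[(b, a)] += 1

# Counter returns 0 for pairs that never co-occur.
matrix = [[pair_counts[(row, col)] for col in items] for row in items]

for row_name, row in zip(items, matrix):
    print(f"{row_name:<15}", *row)
```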
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns
1. Coke and soda are more likely purchased together than any other two items
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
  If a customer purchases soda, then customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift…
  When lift > 1, then the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
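To make the lift < 1 case concrete, the sketch below computes lift for "if soda then Coke" on the five POS transactions from the earlier slides, and then for the negative rule "if soda then NOT Coke". The helper name `lift` and the use of a predicate for the consequent are assumptions made for this illustration; lift is taken here as confidence divided by the share of all baskets that satisfy the consequent.

```python
baskets = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def lift(baskets, antecedent, has_consequent):
    """Lift of 'if antecedent then consequent'.

    has_consequent is a function basket -> bool, so the same helper
    can score a rule and its negative rule.
    """
    with_antecedent = [b for b in baskets if antecedent in b]
    confidence = sum(1 for b in with_antecedent if has_consequent(b)) / len(with_antecedent)
    baseline = sum(1 for b in baskets if has_consequent(b)) / len(baskets)
    return confidence / baseline


rule_lift = lift(baskets, "soda", lambda b: "Coke" in b)          # (2/3) / (4/5)
negative_lift = lift(baskets, "soda", lambda b: "Coke" not in b)  # (1/3) / (1/5)

print(f"soda => Coke:     lift = {rule_lift:.2f}")
print(f"soda => not Coke: lift = {negative_lift:.2f}")
```

Soda buyers take Coke 67% of the time, but 80% of all baskets contain Coke, so the rule scores below one while its negative rule scores above one.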
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items… "if Coke then soda"
2. Generate co-occurrence matrix for two items… "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items… "if Coke and Milk and Window Cleaner then soda"
4. Etc.…
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant… could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket… 10,000 or more unique items
  50 million 2-item combinations
  More than 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
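The combination counts quoted here are binomial coefficients and can be checked directly. The snippet below uses `math.comb` from the Python standard library (available from Python 3.8); the variable names are illustrative.

```python
from math import comb

menu_triples = comb(100, 3)            # 3 different items from a 100-item menu
supermarket_pairs = comb(10_000, 2)    # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)  # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")         # 161,700
print(f"{supermarket_pairs:,}")    # 49,995,000
print(f"{supermarket_triples:,}")  # 166,616,670,000
```

The exact supermarket figures are about 50 million pairs and about 167 billion triples, which is the growth that product hierarchies are meant to tame.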
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide guide to action: to offer product or service (cross-sell)
61
In retailing case of items purchased together, guidance is not so clear-cut due to extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
20
Tiny Example: Phone Faceplates
21
Many Rules are Possible
For example, Transaction 1 supports several rules, such as
  "If red, then white" ("If a red faceplate is purchased, then so is a white one")
  "If white, then red"
  "If red and white, then green"
  + several more
22
Frequent Item Sets
Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as # items increases
Solution: consider only "frequent item sets"
Criterion for frequent: support
23
Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products…
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
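The level-wise steps above can be sketched in a few lines of Python. This is an illustrative Apriori-style search, not the implementation used by any particular tool; the function name `frequent_item_sets`, the use of a support count rather than a percentage, and the reuse of the five Coke/soda transactions from the earlier slides are all assumptions made for the example.

```python
from itertools import combinations


def frequent_item_sets(transactions, min_count):
    """Return {item set: support count} for all sets with count >= min_count."""
    # Step 2: count one-item sets and keep those meeting the criterion.
    counts = {}
    for basket in transactions:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Steps 3-5: build k-item candidates from frequent (k-1)-item sets...
        prev = list(current)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # ...and drop any candidate with an infrequent (k-1)-item subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count the surviving candidates against the transactions.
        current = {}
        for c in candidates:
            count = sum(1 for basket in transactions if c <= basket)
            if count >= min_count:
                current[c] = count
        frequent.update(current)
        k += 1
    return frequent


transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

result = frequent_item_sets(transactions, min_count=2)
for item_set, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(item_set), count)
```

With a minimum count of 2, milk is dropped at the first level, only {Coke, soda} and {Coke, detergent} survive at the second, and the single three-item candidate is pruned because {soda, detergent} is not frequent.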
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support - see above)
  From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
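The six candidate rules from {red, white, green} can be enumerated by splitting the item set into every antecedent/consequent pair. The sketch below does this using the support counts that appear in these slides (red 6, white 7, green 2, {red, white} 4, and 2 for the remaining sets); the function name `rules_from` and the dictionary layout are assumptions for illustration, and the enumeration order is not meant to match the slide's rule numbering.

```python
from itertools import combinations

# Support counts (out of 10 transactions) as quoted in the slides.
support = {
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red", "white", "green"}): 2,
}


def rules_from(item_set, support, min_confidence):
    """List (antecedent, consequent, confidence) for rules meeting min_confidence."""
    item_set = frozenset(item_set)
    rules = []
    for size in range(1, len(item_set)):
        for antecedent in combinations(sorted(item_set), size):
            antecedent = frozenset(antecedent)
            consequent = item_set - antecedent
            # confidence = support(whole item set) / support(antecedent)
            confidence = support[item_set] / support[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, confidence))
    return rules


all_rules = rules_from({"red", "white", "green"}, support, min_confidence=0.0)
strong_rules = rules_from({"red", "white", "green"}, support, min_confidence=0.70)

for antecedent, consequent, confidence in all_rules:
    print(sorted(antecedent), "=>", sorted(consequent), f"{confidence:.0%}")
```

Three rules clear the 70% criterion: the ones whose antecedents are {green}, {red, green} and {white, green}, each with 100% confidence.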
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)  Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green =>        Red, White      2           4           2               2.5
2     100      Green =>        Red             2           6           2               1.666667
3     100      Green, White => Red             2           6           2               1.666667
4     100      Green =>        White           2           7           2               1.428571
5     100      Green, Red =>   White           2           7           2               1.428571
6     100      Orange =>       White           2           7           2               1.428571
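The lift ratios in this output can be recomputed from the support columns using lift = confidence / benchmark confidence. The sketch below assumes the 10-transaction faceplate data set referred to in the Support slide; the tuple layout and the helper name `lift_ratio` are illustrative.

```python
N_TRANSACTIONS = 10  # size of the faceplate example data set

# (antecedent, consequent, support(a), support(c), support(a U c), reported lift)
rows = [
    ("Green", "Red, White", 2, 4, 2, 2.5),
    ("Green", "Red", 2, 6, 2, 1.666667),
    ("Green, White", "Red", 2, 6, 2, 1.666667),
    ("Green", "White", 2, 7, 2, 1.428571),
    ("Green, Red", "White", 2, 7, 2, 1.428571),
    ("Orange", "White", 2, 7, 2, 1.428571),
]


def lift_ratio(support_a, support_c, support_ac, n):
    confidence = support_ac / support_a  # share of antecedent transactions with the consequent
    benchmark = support_c / n            # share of all transactions with the consequent
    return confidence / benchmark


recomputed = [
    lift_ratio(s_a, s_c, s_ac, N_TRANSACTIONS)
    for _, _, s_a, s_c, s_ac, _ in rows
]

for (antecedent, consequent, *_), value in zip(rows, recomputed):
    print(f"{antecedent} => {consequent}: lift = {value:.6f}")
```

Rule 1 has the highest lift because its consequent {red, white} is the rarest (4 of 10 transactions), so a 100% confidence is the largest improvement over the benchmark.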
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
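A small simulation can illustrate this caution. In the sketch below every item is placed in every basket independently, so no real association exists and the true lift of every pair is 1; the code then counts how many item pairs nevertheless show a lift above a threshold. All parameters (15 items, inclusion probability 0.3, threshold 1.2, the seed, and the two sample sizes) are arbitrary assumptions chosen for illustration.

```python
import random
from itertools import combinations


def count_high_lift_pairs(n_baskets, n_items=15, p=0.3, threshold=1.2, seed=0):
    """Count item pairs whose observed lift exceeds `threshold` in random baskets."""
    rng = random.Random(seed)
    # Each item lands in each basket independently with probability p.
    baskets = [
        {i for i in range(n_items) if rng.random() < p}
        for _ in range(n_baskets)
    ]
    item_count = [sum(1 for b in baskets if i in b) for i in range(n_items)]

    high = 0
    for a, b in combinations(range(n_items), 2):
        if item_count[a] == 0 or item_count[b] == 0:
            continue  # lift is undefined when an item never occurs
        both = sum(1 for basket in baskets if a in basket and b in basket)
        lift = (both / n_baskets) / (
            (item_count[a] / n_baskets) * (item_count[b] / n_baskets)
        )
        if lift > threshold:
            high += 1
    return high


small = count_high_lift_pairs(n_baskets=30)
large = count_high_lift_pairs(n_baskets=3000)
print(f"pairs with lift > 1.2 in 30 baskets:   {small} of 105")
print(f"pairs with lift > 1.2 in 3000 baskets: {large} of 105")
```

With only 30 baskets, sampling noise alone tends to push a sizeable share of the 105 pairs over the threshold, while with 3000 baskets the observed lifts stay close to 1.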
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail - each customer purchases different set of products, different quantities, different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons…
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association RulesThe Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Finally know that the number of Finally know that the number of transactions in a given time-period could transactions in a given time-period could also be huge (hence expensive to analyze)also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items In retailing case of items purchased together guidance is purchased together guidance is not so clear cut due to extensive not so clear cut due to extensive number of rulesnumber of rules
62
Challenges
A major difficulty is that a large number of A major difficulty is that a large number of the rules found may be trivial for anyone the rules found may be trivial for anyone familiar with the business familiar with the business
The computational complexity involved in The computational complexity involved in calculating the results of market basket calculating the results of market basket analysis is at least the square of the number analysis is at least the square of the number of transaction item-lines (records of every of transaction item-lines (records of every item purchased) With data warehouses item purchased) With data warehouses storing billions of transaction lines this storing billions of transaction lines this yields extremely high computational yields extremely high computational requirements requirements
63
Solutions
Differential market basket analysisDifferential market basket analysis can find interesting results and can also can find interesting results and can also eliminate the problem of a potentially eliminate the problem of a potentially high volume of trivial resultshigh volume of trivial results
Special techniques involving Special techniques involving filtering filtering or aggregationor aggregation of the transaction of the transaction database are commonly used to in database are commonly used to in analysis algorithms to increase analysis algorithms to increase performance and allow some level of performance and allow some level of interactivity such as in business interactivity such as in business intelligence applicationsintelligence applications
64
Thank You
21
Many Rules are Possible
For example Transaction 1 supports For example Transaction 1 supports several rules such as several rules such as ldquoldquoIf red then whiterdquo (ldquoIf a red faceplate If red then whiterdquo (ldquoIf a red faceplate
is purchased then so is a white onerdquo)is purchased then so is a white onerdquo) ldquoldquoIf white then redrdquoIf white then redrdquo ldquoldquoIf red and white then greenrdquoIf red and white then greenrdquo + several more+ several more
22
Frequent Item Sets
Ideally we want to create all possible Ideally we want to create all possible combinations of itemscombinations of items
ProblemProblem computation time grows computation time grows exponentially as items increasesexponentially as items increases
SolutionSolution consider only ldquofrequent item consider only ldquofrequent item setsrdquosetsrdquo
Criterion for frequent Criterion for frequent supportsupport
23
Support
SupportSupport = (or percent) of = (or percent) of transactions that include both the transactions that include both the antecedent and the consequentantecedent and the consequent
Example support for the item set Example support for the item set red white is 4 out of 10 red white is 4 out of 10 transactions or 40transactions or 40
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products...
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
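The steps above can be sketched in a few lines of Python. This is an illustrative, unoptimized version of the level-by-level idea, not the implementation used by any particular tool; the basket data comes from the deck's POS example and the function name apriori and the min_count parameter are assumptions.

```python
from itertools import combinations

# The five POS baskets from the "How Good is an Association Rule" slides.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def apriori(baskets, min_count):
    """Return {item set: count} for every item set in at least min_count baskets."""

    def keep_frequent(candidates):
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    # Step 2: frequent one-item sets.
    items = {item for basket in baskets for item in basket}
    current = keep_frequent({frozenset([item]) for item in items})
    frequent = dict(current)

    k = 2
    while current:
        # Steps 3-5: build k-item candidates only from frequent (k-1)-item sets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Drop any candidate that has an infrequent (k-1)-item subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent


for itemset, count in sorted(apriori(BASKETS, 2).items(), key=lambda kv: len(kv[0])):
    print(sorted(itemset), count)
```

With a minimum support of 2 baskets, milk is dropped at the one-item stage, so no pair or triple containing milk is ever counted. That pruning is what keeps the search tractable.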
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
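As a hedged illustration, the three measures can be computed directly from transaction counts. The function names and argument names below are assumptions; the numbers in the usage line are taken from rule 1 of the XLMiner output slide in this deck (Green => Red, White, with 10 transactions in total).

```python
def confidence(n_both, n_antecedent):
    """Share of antecedent transactions that also contain the consequent."""
    return n_both / n_antecedent


def benchmark_confidence(n_consequent, n_total):
    """Share of all transactions that contain the consequent."""
    return n_consequent / n_total


def lift(n_both, n_antecedent, n_consequent, n_total):
    """Confidence divided by benchmark confidence."""
    return confidence(n_both, n_antecedent) / benchmark_confidence(n_consequent, n_total)


# Green => Red, White: Support(a) = 2, Support(c) = 4, Support(a U c) = 2, n = 10.
print(confidence(2, 2), benchmark_confidence(4, 10), lift(2, 2, 4, 10))
```

Here confidence is 100% against a benchmark of 40%, which is the lift ratio of 2.5 that the XLMiner table reports for this rule.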
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence:
    Find frequent item sets (those with sufficient support, see above)
    From these item sets, generate rules with sufficient confidence
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
    [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
    [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
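The six rules can be enumerated mechanically. The sketch below is illustrative only: the support counts are the ones quoted on this slide and on the XLMiner output slide (out of 10 transactions), while the names SUPPORT and rules_from are assumptions, and the order in which it lists rules is not the deck's rule numbering.

```python
from itertools import combinations

# Support counts (out of 10 transactions) as quoted on the deck's slides.
SUPPORT = {
    frozenset({"red"}): 6,
    frozenset({"white"}): 7,
    frozenset({"green"}): 2,
    frozenset({"red", "white"}): 4,
    frozenset({"red", "green"}): 2,
    frozenset({"white", "green"}): 2,
    frozenset({"red", "white", "green"}): 2,
}


def rules_from(itemset, support, min_conf=0.0):
    """List (antecedent, consequent, confidence) for every split of itemset."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            a = frozenset(antecedent)
            conf = support[itemset] / support[a]
            if conf >= min_conf:
                rules.append((sorted(a), sorted(itemset - a), conf))
    return rules


for antecedent, consequent, conf in rules_from({"red", "white", "green"}, SUPPORT):
    print(antecedent, "=>", consequent, f"{conf:.0%}")
```

Calling rules_from with min_conf=0.7 keeps only the three rules with 100% confidence, matching the slide's count of three reported rules.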
30
All Rules (XLMiner Output)
Rule #  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1       100      Green =>         Red, White      2           4           2               2.5
2       100      Green =>         Red             2           6           2               1.666667
3       100      Green, White =>  Red             2           6           2               1.666667
4       100      Green =>         White           2           7           2               1.428571
5       100      Green, Red =>    White           2           7           2               1.428571
6       100      Orange =>        White           2           7           2               1.428571
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
    Customers
    Orders (basic purchase data)
    Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
    Identify who customers are (not by name)
    Understand why they make certain purchases
    Gain insight about its merchandise (products):
        Fast and slow movers
        Products which are purchased together
        Products which might benefit from promotion
    Take action:
        Store layouts
        Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
Data mining (DM) technique most closely allied with Market Basket Analysis
Association rules (AR) can be automatically generated
    AR represent patterns in the data without a specified target variable
    Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y → Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
    support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
    confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y)P(Z)
    interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
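A worked example may help here. It assumes the five POS baskets shown on the "How Good is an Association Rule" slides of this deck, with Y = soda (in 3 of 5 baskets), Z = Coke (in 4 of 5 baskets), and both together in 2 of 5 baskets:

```latex
\begin{align*}
\text{support}    &= P(Y \wedge Z) = \tfrac{2}{5} = 0.40 \\
\text{confidence} &= \frac{P(Y \wedge Z)}{P(Y)} = \frac{2/5}{3/5} = \tfrac{2}{3} \approx 0.67 \\
\text{interest}   &= \frac{P(Y \wedge Z)}{P(Y)\,P(Z)} = \frac{2/5}{(3/5)(4/5)} = \tfrac{5}{6} \approx 0.83
\end{align*}
```

The interest is below one, so under the criterion above this rule would not count as a strong association even though its confidence is 67%.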
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.):
    Purchases made using credit/debit cards
    Optional Telco Service purchases
    Banking services
    Unusual combinations of insurance claims can be a warning of fraud
    Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered:
    Avg # of orders/customer
    Avg # unique items/order
    Avg # of items/order
    For a product:
        What % of customers have purchased
        Avg # orders/customer include it
        Avg quantity of it purchased/order
    Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types:
    Actionable Rules: contain high-quality, actionable information
    Trivial Rules: information already well-known by those familiar with the business
    Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products

                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
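The co-occurrence matrix can be rebuilt from the five POS transactions. This is a small sketch with assumed names (ITEMS, BASKETS, cooccurrence); the diagonal holds the number of baskets containing each item, and each off-diagonal cell holds the number of baskets containing both items.

```python
from itertools import combinations_with_replacement

ITEMS = ["Coke", "window cleaner", "milk", "soda", "detergent"]
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def cooccurrence(baskets, items):
    """Symmetric matrix of basket counts; rows and columns follow the order of items."""
    index = {item: i for i, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in items]
    for basket in baskets:
        ordered = sorted(basket, key=index.get)
        # Pairs (a, a) fill the diagonal; pairs (a, b) fill both mirrored cells.
        for a, b in combinations_with_replacement(ordered, 2):
            matrix[index[a]][index[b]] += 1
            if a != b:
                matrix[index[b]][index[a]] += 1
    return matrix


for item, row in zip(ITEMS, cooccurrence(BASKETS, ITEMS)):
    print(f"{item:15s}", row)
```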
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns:
1. Coke and soda are purchased together in 2 of the 5 baskets, as are Coke and detergent; no other pair of items co-occurs more often
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule: "If a customer purchases soda, then customer also purchases Coke"?
    2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
    2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
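The two confidence figures on this slide can be checked with a short sketch. The function name confidence and its arguments are illustrative assumptions; the baskets are the five POS transactions listed above.

```python
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def confidence(if_items, then_items, baskets):
    """Baskets containing all the items divided by baskets containing the "if" items."""
    if_items, then_items = set(if_items), set(then_items)
    with_if = [b for b in baskets if if_items <= b]
    with_all = [b for b in with_if if then_items <= b]
    return len(with_all) / len(with_if)


print(f"soda => Coke: {confidence({'soda'}, {'Coke'}, BASKETS):.0%}")
print(f"Coke => soda: {confidence({'Coke'}, {'soda'}, BASKETS):.0%}")
```

The asymmetry is the point of the slide: the same pair of items gives two different rules with two different confidences.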
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift... When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
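A hedged sketch of the lift calculation on the five POS baskets from the previous slides. It reads the Negative Rule as "if soda, then NOT Coke", which is an interpretation since the slide does not spell the rule out; the function and helper names are assumptions.

```python
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def lift(if_items, result_holds, baskets):
    """Lift of 'if if_items then result', where result_holds tests one basket."""
    matching = [b for b in baskets if set(if_items) <= b]
    conf = sum(1 for b in matching if result_holds(b)) / len(matching)
    benchmark = sum(1 for b in baskets if result_holds(b)) / len(baskets)
    return conf / benchmark


def has_coke(basket):
    return "Coke" in basket


def no_coke(basket):
    return "Coke" not in basket


print(round(lift({"soda"}, has_coke, BASKETS), 3))  # (2/3) / (4/5)
print(round(lift({"soda"}, no_coke, BASKETS), 3))   # (1/3) / (1/5)
```

On this tiny data set the rule "if soda then Coke" has lift below 1, while its negative counterpart has lift above 1, which is the situation the slide describes. With only five baskets this says little about real shoppers.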
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant... could have 100 items on its menu
    How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
    50 million 2-item combinations
    More than 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
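These counts are binomial coefficients ("n choose k"), so they can be checked with the Python standard library. The helper name item_set_count is an assumption made for this sketch.

```python
from math import comb


def item_set_count(n_items, set_size):
    """Number of distinct item sets of the given size drawn from n_items products."""
    return comb(n_items, set_size)


print(f"{item_set_count(100, 3):,}")      # fast food menu, 3-item sets
print(f"{item_set_count(10_000, 2):,}")   # supermarket, 2-item sets
print(f"{item_set_count(10_000, 3):,}")   # supermarket, 3-item sets
```

The exact supermarket figures are 49,995,000 pairs and 166,616,670,000 triples, so the slide's round numbers are the right order of magnitude.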
52
Business and other cases
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide a guide to action: to offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
22
Frequent Item Sets
Ideally we want to create all possible Ideally we want to create all possible combinations of itemscombinations of items
ProblemProblem computation time grows computation time grows exponentially as items increasesexponentially as items increases
SolutionSolution consider only ldquofrequent item consider only ldquofrequent item setsrdquosetsrdquo
Criterion for frequent Criterion for frequent supportsupport
23
Support
SupportSupport = (or percent) of = (or percent) of transactions that include both the transactions that include both the antecedent and the consequentantecedent and the consequent
Example support for the item set Example support for the item set red white is 4 out of 10 red white is 4 out of 10 transactions or 40transactions or 40
24
Apriori Algorithm
25
Generating Frequent Item Sets
For For kk productshellip productshellip
11 User sets a minimum support criterionUser sets a minimum support criterion
22 Next generate list of one-item sets that Next generate list of one-item sets that meet the support criterionmeet the support criterion
33 Use the list of one-item sets to generate Use the list of one-item sets to generate list of two-item sets that meet the list of two-item sets that meet the support criterionsupport criterion
44 Use list of two-item sets to generate list Use list of two-item sets to generate list of three-item setsof three-item sets
55 Continue up through Continue up through kk-item sets-item sets
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift… When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
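A small sketch of the lift calculation, assuming the five POS transactions from the deck's Coke and soda example and using hypothetical helper names (`support`, `lift`). For the rule "if soda then Coke" the lift is below 1, and the negative rule "if soda then not Coke" comes out above 1, which is the situation the slide describes.

```python
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def support(items, baskets):
    """Fraction of all baskets that contain every item in `items`."""
    return sum(1 for b in baskets if items <= b) / len(baskets)


def lift(antecedent, consequent, baskets):
    """Observed co-occurrence divided by the co-occurrence expected with no relationship."""
    observed = support(antecedent | consequent, baskets)
    expected = support(antecedent, baskets) * support(consequent, baskets)
    return observed / expected


# Rule: if soda then Coke.
rule_lift = lift({"soda"}, {"Coke"}, transactions)

# Negative rule: if soda then NOT Coke, computed directly from the baskets.
n = len(transactions)
p_soda_no_coke = sum(1 for b in transactions if "soda" in b and "Coke" not in b) / n
p_no_coke = sum(1 for b in transactions if "Coke" not in b) / n
negative_lift = p_soda_no_coke / (support({"soda"}, transactions) * p_no_coke)

print(f"lift(soda => Coke)     = {rule_lift:.3f}")
print(f"lift(soda => not Coke) = {negative_lift:.3f}")
```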
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items… "if Coke then soda"
2. Generate co-occurrence matrix for two items… "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items… "if Coke and Milk and Window Cleaner then soda"
4. Etc…
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant… could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket… 10,000 or more unique items
  50 million 2-item combinations
  Roughly 167 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
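The combination counts can be checked with the binomial coefficient. This is a sketch using Python's standard `math.comb` (available from Python 3.8); the variable names are illustrative. The exact 3-item count for 10,000 items is in the hundreds of billions, the same order of magnitude as the figure quoted on the slide.

```python
from math import comb

# Number of ways to choose k distinct items out of n, ignoring order.
menu_triples = comb(100, 3)            # 3 different items from a 100-item menu
supermarket_pairs = comb(10_000, 2)    # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)  # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```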
52
Business and other cases
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or
  activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
23
Support
Support = number (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%
24
Apriori Algorithm
25
Generating Frequent Item Sets
For k products…
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
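The level-wise steps above can be sketched as follows. This is an illustrative, unoptimized version of the idea, not the XLMiner implementation; it reuses the five POS transactions from the Coke and soda example as stand-in data, and `frequent_itemsets` is a hypothetical name.

```python
from itertools import combinations

transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def frequent_itemsets(baskets, min_support):
    """Level-wise search: frequent (k-1)-item sets seed the k-item candidates."""
    n = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    # Step 2: one-item sets that meet the support criterion.
    singles = {frozenset([item]) for b in baskets for item in b}
    current = {s for s in singles if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Steps 3-5: join frequent (k-1)-item sets into k-item candidates and
        # keep a candidate only if every (k-1)-item subset is itself frequent.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Step 1: the user sets the minimum support (here 40%, i.e. 2 of 5 baskets).
result = frequent_itemsets(transactions, min_support=0.4)
for itemset, sup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{sup:.0%}")
```

The pruning step is what keeps the candidate lists small: a three-item set is only counted if all of its two-item subsets already met the support criterion.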
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format: Binary Matrix
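The slide's own matrix did not survive in this transcript, so the following is only a sketch of the format: one row per transaction, one column per item, 1 if the item is in the basket and 0 otherwise. The data are the five POS transactions from the Coke and soda example, used here as an assumption.

```python
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

# Fix a column order so every row lines up with the same items.
items = sorted({item for basket in transactions for item in basket})

# One row per transaction: 1 if the item was purchased, 0 otherwise.
matrix = [[1 if item in basket else 0 for item in items] for basket in transactions]

print(items)
for row in matrix:
    print(row)
```

Column sums of this matrix give the number of baskets containing each item, which is the numerator of single-item support.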
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support, see above)
  From these item sets, generate rules with sufficient confidence
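The second step, turning one frequent item set into rules that pass a confidence threshold, might look like the sketch below. The function names are hypothetical and the data are the five POS transactions from the Coke and soda example, used as an assumption.

```python
from itertools import combinations

transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def support_count(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for b in baskets if itemset <= b)


def rules_from_itemset(itemset, baskets, min_confidence):
    """Split one frequent item set into antecedent => consequent rules."""
    rules = []
    both = support_count(itemset, baskets)
    for size in range(1, len(itemset)):
        for antecedent in map(frozenset, combinations(sorted(itemset), size)):
            # confidence = support(whole item set) / support(antecedent)
            conf = both / support_count(antecedent, baskets)
            if conf >= min_confidence:
                rules.append((set(antecedent), set(itemset - antecedent), conf))
    return rules


rules = rules_from_itemset(frozenset({"Coke", "detergent"}), transactions, min_confidence=0.7)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), f"{conf:.0%}")
```

With a 70% criterion only "detergent => Coke" survives; the reverse rule has 50% confidence and is dropped.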
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green =>         Red, White      2           4           2               2.5
2     100      Green =>         Red             2           6           2               1.666667
3     100      Green, White =>  Red             2           6           2               1.666667
4     100      Green =>         White           2           7           2               1.428571
5     100      Green, Red =>    White           2           7           2               1.428571
6     100      Orange =>        White           2           7           2               1.428571
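The lift ratios in this output can be reproduced from the support counts. The sketch below assumes 10 transactions in total, a figure taken from the "4 out of 10 transactions" support example elsewhere in the deck rather than from the XLMiner output itself.

```python
N_TRANSACTIONS = 10  # assumption: from the "4 out of 10 transactions" support example

# (rule number, support(a), support(c), support(a U c)) as listed in the table.
rows = [
    (1, 2, 4, 2),
    (2, 2, 6, 2),
    (3, 2, 6, 2),
    (4, 2, 7, 2),
    (5, 2, 7, 2),
    (6, 2, 7, 2),
]

lift_by_rule = {}
for rule, sup_a, sup_c, sup_ac in rows:
    conf = sup_ac / sup_a                # confidence of the rule
    benchmark = sup_c / N_TRANSACTIONS   # benchmark confidence
    lift_by_rule[rule] = conf / benchmark
    print(rule, f"conf={conf:.0%}", f"lift={lift_by_rule[rule]:.6f}")
```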
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution: The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
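One way to see this effect is to simulate baskets in which items are purchased independently, so that no real association exists, and count how many item pairs still show a high lift. Everything in this sketch is an assumption made for illustration: the function name, the 20 items, the 20% purchase probability, and the 1.5 lift threshold.

```python
import random
from itertools import combinations


def count_high_lift_pairs(n_baskets, n_items=20, p=0.2, min_lift=1.5, seed=0):
    """Count item pairs whose lift reaches min_lift in purely random baskets."""
    rng = random.Random(seed)
    # Each item lands in each basket independently with probability p.
    baskets = [
        {i for i in range(n_items) if rng.random() < p}
        for _ in range(n_baskets)
    ]
    item_count = [sum(1 for b in baskets if i in b) for i in range(n_items)]
    hits = 0
    for a, b in combinations(range(n_items), 2):
        if item_count[a] == 0 or item_count[b] == 0:
            continue  # lift is undefined when an item never occurs
        both = sum(1 for basket in baskets if a in basket and b in basket)
        expected = (item_count[a] / n_baskets) * (item_count[b] / n_baskets)
        if (both / n_baskets) / expected >= min_lift:
            hits += 1
    return hits


spurious_small = count_high_lift_pairs(n_baskets=50)
spurious_large = count_high_lift_pairs(n_baskets=5000)
print("pairs with lift >= 1.5 in 50 random baskets:  ", spurious_small)
print("pairs with lift >= 1.5 in 5000 random baskets:", spurious_large)
```

Any pair counted here is spurious by construction, so the comparison between the two sample sizes illustrates why rules backed by more records deserve more trust.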
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being the most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons…
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y) P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)…
  Purchases made using credit/debit cards
  Optional Telco service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased it
    Avg # orders/customer that include it
    Avg quantity of it purchased/order
  Etc…
Visualization is extremely helpful
Transaction Data
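A few of these questions can be answered directly from order-line records. The sketch below uses made-up order lines with hypothetical columns (customer, order, item, quantity); a real relational schema would differ, and in practice these would typically be SQL aggregates.

```python
from collections import defaultdict

# Hypothetical order lines: (customer_id, order_id, item, quantity).
order_lines = [
    ("c1", "o1", "Coke", 2),
    ("c1", "o1", "soda", 1),
    ("c1", "o2", "Coke", 1),
    ("c2", "o3", "milk", 1),
    ("c2", "o3", "Coke", 1),
    ("c3", "o4", "detergent", 3),
]

orders_by_customer = defaultdict(set)
items_by_order = defaultdict(set)
quantity_by_order = defaultdict(int)
for customer, order, item, qty in order_lines:
    orders_by_customer[customer].add(order)
    items_by_order[order].add(item)
    quantity_by_order[order] += qty

n_customers = len(orders_by_customer)
n_orders = len(items_by_order)

avg_orders_per_customer = sum(len(o) for o in orders_by_customer.values()) / n_customers
avg_unique_items_per_order = sum(len(i) for i in items_by_order.values()) / n_orders
avg_items_per_order = sum(quantity_by_order.values()) / n_orders

# For one product: what share of customers have ever purchased it?
product = "Coke"
buyers = {customer for customer, _, item, _ in order_lines if item == product}
pct_customers_buying = len(buyers) / n_customers

print(avg_orders_per_customer, avg_unique_items_per_order, avg_items_per_order)
print(f"{pct_customers_buying:.0%} of customers bought {product}")
```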
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well-known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
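The co-occurrence counts can be derived mechanically from the five transactions. The sketch below is one way to do it with the Python standard library; the nested-dictionary layout is an illustrative choice, not something prescribed by the slides.

```python
from itertools import combinations_with_replacement

transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]

# counts[a][b] = number of baskets containing both a and b;
# the diagonal counts[a][a] is the number of baskets containing a.
counts = {a: {b: 0 for b in items} for a in items}
for basket in transactions:
    for a, b in combinations_with_replacement(sorted(basket), 2):
        counts[a][b] += 1
        if a != b:
            counts[b][a] += 1  # keep the matrix symmetric

for a in items:
    print(f"{a:<15}", *(counts[a][b] for b in items))
```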
24
Apriori Algorithm
25
Generating Frequent Item Sets
For For kk productshellip productshellip
11 User sets a minimum support criterionUser sets a minimum support criterion
22 Next generate list of one-item sets that Next generate list of one-item sets that meet the support criterionmeet the support criterion
33 Use the list of one-item sets to generate Use the list of one-item sets to generate list of two-item sets that meet the list of two-item sets that meet the support criterionsupport criterion
44 Use list of two-item sets to generate list Use list of two-item sets to generate list of three-item setsof three-item sets
55 Continue up through Continue up through kk-item sets-item sets
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types:
  Actionable rules: contain high-quality, actionable information
  Trivial rules: information already well known by those familiar with the business
  Inexplicable rules: have no explanation and do not suggest action
Trivial and inexplicable rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
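The co-occurrence counts can be derived mechanically from the five POS transactions. The following is a minimal sketch, not taken from the slides; the function name `co_occurrence` and the dictionary layout are illustrative assumptions.

```python
# POS transactions from the slide (customer number -> items purchased).
transactions = {
    1: {"Coke", "soda"},
    2: {"milk", "Coke", "window cleaner"},
    3: {"Coke", "detergent"},
    4: {"Coke", "detergent", "soda"},
    5: {"window cleaner", "soda"},
}
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]


def co_occurrence(transactions, items):
    """Count, for every ordered pair (a, b), the baskets containing both.

    The diagonal entry (a, a) is simply the number of baskets containing a.
    """
    counts = {a: {b: 0 for b in items} for a in items}
    for basket in transactions.values():
        for a in basket:
            for b in basket:
                counts[a][b] += 1
    return counts


matrix = co_occurrence(transactions, items)

# Print the matrix in the same row/column order as the slide.
for a in items:
    print(f"{a:15s}", *(f"{matrix[a][b]:3d}" for b in items))
```

The matrix is symmetric, so only the upper triangle needs to be inspected when looking for item pairs.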
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns:
1. Coke and soda are more likely purchased together than any other two items
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule: "If a customer purchases soda, then the customer also purchases Coke"?
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
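The two confidence figures above can be checked with a few lines of code. This is an illustrative sketch only; the helper name `confidence` is an assumption and does not come from the slides.

```python
# The five POS transactions from the slide, one set of items per basket.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def confidence(transactions, antecedent, consequent):
    """Share of baskets containing the "if" items that also contain the "then" items."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_if = sum(1 for t in transactions if antecedent <= t)
    n_all = sum(1 for t in transactions if (antecedent | consequent) <= t)
    return n_all / n_if if n_if else 0.0


soda_to_coke = confidence(transactions, {"soda"}, {"Coke"})  # 2 of 3 soda baskets
coke_to_soda = confidence(transactions, {"Coke"}, {"soda"})  # 2 of 4 Coke baskets
print(f"soda => Coke: {soda_to_coke:.0%}")
print(f"Coke => soda: {coke_to_soda:.0%}")
```

Confidence is not symmetric: reversing the rule changes the denominator, which is why the two values differ.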
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift: when lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the negative rule produces a better rule than guessing
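To make the lift < 1 case concrete, the sketch below computes lift for "if soda then Coke" on the five POS transactions from the earlier slides, and then for the negative rule "if soda then NOT Coke". The helper names `count` and `lift` are assumptions made for this illustration.

```python
# The five POS transactions used in the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def count(transactions, present=(), absent=()):
    """Number of baskets that contain every item in `present` and none in `absent`."""
    present, absent = set(present), set(absent)
    return sum(1 for t in transactions if present <= t and not (absent & t))


def lift(n_both, n_if, n_then, n_total):
    """Observed co-occurrence rate divided by the rate expected under independence."""
    return (n_both / n_total) / ((n_if / n_total) * (n_then / n_total))


n = len(transactions)

# Rule: if soda then Coke.
rule_lift = lift(
    count(transactions, ["soda", "Coke"]),
    count(transactions, ["soda"]),
    count(transactions, ["Coke"]),
    n,
)

# Negative rule: if soda then NOT Coke.
negative_lift = lift(
    count(transactions, ["soda"], absent=["Coke"]),
    count(transactions, ["soda"]),
    count(transactions, absent=["Coke"]),
    n,
)

print(f"lift(soda => Coke)     = {rule_lift:.2f}")
print(f"lift(soda => not Coke) = {negative_lift:.2f}")
```

On this tiny data set the positive rule has lift below one while the negative rule has lift above one, which is the situation the slide describes. With only five baskets the numbers are illustrative rather than statistically meaningful.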
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items: "if Coke then soda"
2. Generate co-occurrence matrix for two items: "if Coke and milk then soda"
3. Generate co-occurrence matrix for three items: "if Coke and milk and window cleaner then soda"
4. Etc.
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant: could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket: 10,000 or more unique items
  50 million 2-item combinations
  More than 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
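The combination counts quoted above are binomial coefficients ("n choose k"). A short check, assuming exactly 100 menu items and exactly 10,000 supermarket items:

```python
import math

# Number of distinct k-item combinations that can be drawn from n unique items.
menu_triples = math.comb(100, 3)
supermarket_pairs = math.comb(10_000, 2)
supermarket_triples = math.comb(10_000, 3)

print(f"100 items, 3 at a time:    {menu_triples:,}")
print(f"10,000 items, 2 at a time: {supermarket_pairs:,}")
print(f"10,000 items, 3 at a time: {supermarket_triples:,}")
```

The count grows roughly as n^k / k!, which is why grouping items into a product hierarchy (reducing n) has such a large effect.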
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
The banking case seems to provide well-defined and intelligible information of the form:
  account_1 and account_2, etc., or
  activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
25
Generating Frequent Item Sets
For k products:
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
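The level-by-level procedure above can be sketched as follows. This is an illustrative implementation of the general idea (build size-k candidates only from frequent sets of size k-1), not the code behind the slides; the function name, the use of a raw count as the support criterion, and the reuse of the five Coke/soda baskets from the later slides are all assumptions.

```python
from itertools import combinations

# Example baskets (the five POS transactions used elsewhere in this deck).
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def frequent_itemsets(transactions, min_count):
    """Return {itemset: support count} for every item set meeting min_count."""
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]  # step 2: one-item sets
    frequent = {}
    k = 1
    while candidates:
        # Count support for each candidate and keep those meeting the criterion.
        level = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_count:
                level[c] = n
        frequent.update(level)

        # Steps 3-5: build (k+1)-item candidates from the frequent k-item sets,
        # keeping only those whose k-item subsets are all frequent.
        k += 1
        next_candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                next_candidates.add(union)
        candidates = list(next_candidates)
    return frequent


result = frequent_itemsets(transactions, min_count=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

Because an item set can only be frequent if all of its subsets are frequent, most of the possible combinations are never counted at all, which is what keeps the search tractable.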
26
Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set
Lift = confidence / (benchmark confidence)
Benchmark confidence = transactions with consequent as % of all transactions
Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence:
  Find frequent item sets (those with sufficient support; see above)
  From these item sets, generate rules with sufficient confidence
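The second step, turning one frequent item set into rules that pass a confidence threshold, might look like the sketch below. The function names and the example baskets (the five Coke/soda transactions from the later slides) are assumptions chosen for illustration.

```python
from itertools import combinations

# Example baskets (the five POS transactions used elsewhere in this deck).
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def support_count(transactions, itemset):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rules_from_itemset(transactions, itemset, min_conf):
    """Split `itemset` into every antecedent/consequent pair and keep the
    rules whose confidence is at least `min_conf`."""
    itemset = frozenset(itemset)
    n_all = support_count(transactions, itemset)
    rules = []
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            antecedent = frozenset(antecedent)
            n_if = support_count(transactions, antecedent)
            if n_if == 0:
                continue
            conf = n_all / n_if
            if conf >= min_conf:
                rules.append((antecedent, itemset - antecedent, conf))
    return rules


# Two frequent pairs (support count 2 each) and a 60% confidence criterion.
rules = []
for pair in ({"Coke", "soda"}, {"Coke", "detergent"}):
    rules.extend(rules_from_itemset(transactions, pair, min_conf=0.6))

for antecedent, consequent, conf in rules:
    print(f"{sorted(antecedent)} => {sorted(consequent)}  confidence = {conf:.0%}")
```

Each frequent item set of size m yields up to 2^m - 2 candidate rules, so the confidence filter matters more as item sets get larger.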
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
30
All Rules (XLMiner Output)
Rule  Conf %  Antecedent (a)  Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100     Green        => Red, White      2           4           2               2.5
2     100     Green        => Red             2           6           2               1.666667
3     100     Green, White => Red             2           6           2               1.666667
4     100     Green        => White           2           7           2               1.428571
5     100     Green, Red   => White           2           7           2               1.428571
6     100     Orange       => White           2           7           2               1.428571
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
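The role of chance can be demonstrated with a small simulation. In the sketch below every item is placed in each basket independently, so no real association exists, yet with few transactions many item pairs still show a high lift. All parameters (20 items, purchase probability 0.2, lift threshold 1.5, seed 0) are arbitrary assumptions chosen for illustration.

```python
import random
from itertools import combinations


def spurious_pairs(n_transactions=50, n_items=20, p=0.2, min_lift=1.5, seed=0):
    """Generate purely random baskets and return the item pairs whose lift
    is at least `min_lift`, even though the items are independent by design."""
    rng = random.Random(seed)
    baskets = [
        {i for i in range(n_items) if rng.random() < p}
        for _ in range(n_transactions)
    ]
    n = len(baskets)
    item_count = {i: sum(1 for b in baskets if i in b) for i in range(n_items)}

    found = []
    for a, b in combinations(range(n_items), 2):
        both = sum(1 for t in baskets if a in t and b in t)
        if both and item_count[a] and item_count[b]:
            lift = both * n / (item_count[a] * item_count[b])
            if lift >= min_lift:
                found.append((a, b, lift))
    return found


few_records = spurious_pairs(n_transactions=50)
many_records = spurious_pairs(n_transactions=5000)
print("pairs with lift >= 1.5 in 50 random baskets:  ", len(few_records))
print("pairs with lift >= 1.5 in 5000 random baskets:", len(many_records))
```

Comparing the two counts shows why rules supported by many records deserve more trust than rules found in a small sample.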
33
Market Basket Analysis
MBA is a set of techniques, association rules being the most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products):
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action:
    Store layouts
    Which products to put on specials, promote, coupons, etc.
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
Data mining (DM) technique most closely allied with Market Basket Analysis
Association rules (AR) can be automatically generated:
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
confidence = P(Y ∧ Z) / P(Y)
Interest of the rule measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption that Y and Z are independent, P(Y)P(Z)
interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
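As a worked illustration of the three measures (an added example, not part of the original slide), take Y = soda and Z = Coke in the five POS transactions used in the "How Good is an Association Rule" slides, where soda appears in 3 baskets, Coke in 4, and both together in 2:

```latex
\begin{align*}
\text{support}    &= P(Y \wedge Z) = \tfrac{2}{5} = 0.40 \\
\text{confidence} &= \frac{P(Y \wedge Z)}{P(Y)} = \frac{2/5}{3/5} = \tfrac{2}{3} \approx 0.67 \\
\text{interest}   &= \frac{P(Y \wedge Z)}{P(Y)\,P(Z)} = \frac{2/5}{(3/5)(4/5)} = \tfrac{5}{6} \approx 0.83
\end{align*}
```

Interest equals confidence divided by P(Z), so it is the same quantity as the lift ratio (confidence over benchmark confidence). Here it is below one, so this rule would not meet the "interest of at least one" criterion.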
26
Measures of Performance
ConfidenceConfidence the of antecedent transactions the of antecedent transactions that also have the consequent item setthat also have the consequent item set
LiftLift = = confidenceconfidence((benchmark confidencebenchmark confidence))
Benchmark confidenceBenchmark confidence = transactions with = transactions with consequent as of all transactionsconsequent as of all transactions
Lift gt 1 indicates a rule that is useful in finding Lift gt 1 indicates a rule that is useful in finding consequent items sets (ie more useful than just consequent items sets (ie more useful than just selecting transactions randomly)selecting transactions randomly)
27
Alternate Data Format Binary Matrix
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y)P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
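As a minimal sketch of the three measures above, the Python below computes support, confidence and interest for a rule Y => Z over a list of baskets. The function name rule_measures and the four sample baskets are assumptions made for illustration only; they do not come from the slides.

```python
def rule_measures(baskets, y, z):
    """Return (support, confidence, interest) for the rule y => z.

    baskets is a list of sets of item names; y and z are single items.
    """
    n = len(baskets)
    p_y = sum(1 for b in baskets if y in b) / n               # P(Y)
    p_z = sum(1 for b in baskets if z in b) / n               # P(Z)
    p_yz = sum(1 for b in baskets if y in b and z in b) / n   # P(Y and Z)

    support = p_yz
    confidence = p_yz / p_y if p_y else 0.0
    interest = p_yz / (p_y * p_z) if p_y and p_z else 0.0
    return support, confidence, interest


# Hypothetical baskets, invented for this example only.
baskets = [{"bread", "butter"}, {"bread", "butter", "jam"}, {"bread"}, {"jam"}]
support, confidence, interest = rule_measures(baskets, "bread", "butter")
print(support, confidence, interest)
```

An interest above one, as in this toy data, means the two items occur together more often than independence would predict.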
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)…
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased it
    Avg orders/customer that include it
    Avg quantity of it purchased/order
  Etc…
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well-known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
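The co-occurrence counts can be rebuilt directly from the five POS transactions. The sketch below is one possible way to do it in Python; the variable names (transactions, items, cooc) are illustrative choices, not part of the slides.

```python
from itertools import combinations_with_replacement

# The five POS transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]

# cooc[a][b] = number of baskets containing both a and b;
# the diagonal cooc[a][a] is the number of baskets containing a.
cooc = {a: {b: 0 for b in items} for a in items}
for basket in transactions:
    for a, b in combinations_with_replacement(sorted(basket), 2):
        cooc[a][b] += 1
        if a != b:
            cooc[b][a] += 1  # keep the matrix symmetric

for a in items:
    print(f"{a:15s}", *(f"{cooc[a][b]:3d}" for b in items))
```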
46
How Good is an Association Rule
Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns
1. Coke and soda are purchased together more often than most other pairs of items (2 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
  If a customer purchases soda, then the customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
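The two confidence figures can be checked with a few lines of Python. This is a sketch over the five POS transactions; the helper name confidence and its argument names are assumptions made for the example.

```python
# The five POS transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def confidence(transactions, if_items, then_items):
    """Transactions with all the items divided by transactions with the 'if' items."""
    if_items, then_items = set(if_items), set(then_items)
    n_if = sum(1 for t in transactions if if_items <= t)
    n_all = sum(1 for t in transactions if (if_items | then_items) <= t)
    return n_all / n_if if n_if else 0.0


soda_to_coke = confidence(transactions, {"soda"}, {"Coke"})  # 2 of 3 soda baskets
coke_to_soda = confidence(transactions, {"Coke"}, {"soda"})  # 2 of 4 Coke baskets
print(f"soda => Coke: {soda_to_coke:.0%}, Coke => soda: {coke_to_soda:.0%}")
```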
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the number of records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift…
  When lift > 1, the rule is better at predicting the result than guessing
  When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
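A short sketch of the lift calculation, applied to the five POS transactions from the earlier slides. The function name lift and its negate flag are hypothetical helpers introduced for this example. On this data the rule soda => Coke has 67% confidence but a lift below 1, because Coke already appears in 4 of the 5 baskets; the negative rule (soda => NOT Coke) comes out above 1.

```python
# The five POS transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def lift(transactions, if_item, then_item, negate=False):
    """Observed co-occurrence divided by the co-occurrence expected with no relationship.

    With negate=True the consequent becomes "does NOT contain then_item".
    """
    n = len(transactions)
    result_holds = [(then_item in t) != negate for t in transactions]
    n_if = sum(1 for t in transactions if if_item in t)
    n_then = sum(result_holds)
    n_both = sum(1 for t, ok in zip(transactions, result_holds) if if_item in t and ok)
    if n_if == 0 or n_then == 0:
        return 0.0
    return (n_both / n) / ((n_if / n) * (n_then / n))


rule_lift = lift(transactions, "soda", "Coke")                    # (2/5) / (3/5 * 4/5)
negative_lift = lift(transactions, "soda", "Coke", negate=True)   # (1/5) / (3/5 * 1/5)
print(round(rule_lift, 3), round(negative_lift, 3))
```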
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items… "if Coke then soda"
2. Generate co-occurrence matrix for two items… "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items… "if Coke and Milk and Window Cleaner then soda"
4. Etc…
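The steps above amount to counting item sets of growing size. The sketch below counts 1-, 2- and 3-item sets in the five POS transactions from the earlier slides; itemset_counts is a hypothetical helper name, and the brute-force enumeration is only meant for tiny data, not as the method a production tool would use.

```python
from collections import Counter
from itertools import combinations

# The five POS transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def itemset_counts(transactions, size):
    """Count how many baskets contain each item set of the given size."""
    counts = Counter()
    for basket in transactions:
        # sorted() gives every item set one canonical tuple form
        for combo in combinations(sorted(basket), size):
            counts[combo] += 1
    return counts


singles = itemset_counts(transactions, 1)
pairs = itemset_counts(transactions, 2)
triples = itemset_counts(transactions, 3)
print(len(singles), len(pairs), len(triples))
```

Each extra item in the "if" part requires counts for the next larger item-set size, which is why the number of candidates grows so quickly.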
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant… could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket… 10,000 or more unique items
  About 50 million 2-item combinations
  More than 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
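These counts are binomial coefficients and can be checked with Python's math.comb (available from Python 3.8). The variable names are illustrative. For exactly 10,000 items the 3-item count works out to roughly 167 billion, the same order of magnitude as the figure quoted on the slide.

```python
from math import comb

menu_triples = comb(100, 3)             # 3 different items from a 100-item menu
supermarket_pairs = comb(10_000, 2)     # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)   # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```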
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form
  account_1 and account_2, etc., or
  activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
27
Alternate Data Format: Binary Matrix
28
Process of Rule Selection
Generate all rules that meet specified support & confidence
  Find frequent item sets (those with sufficient support; see above)
  From these item sets, generate rules with sufficient confidence
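The two-step process can be sketched in Python as follows, using the five POS transactions from the "How Good is an Association Rule" slides. The names support_count and generate_rules and the thresholds (support count of at least 2, confidence of at least 60%) are assumptions chosen for illustration. Step 1 enumerates every possible item set by brute force, which only works for tiny data; real tools prune the search.

```python
from itertools import combinations

# The five POS transactions from the "How Good is an Association Rule" slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def support_count(itemset):
    """Number of transactions containing every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def generate_rules(min_support_count, min_confidence):
    items = sorted(set().union(*transactions))
    # Step 1: find frequent item sets (brute force over all sizes).
    frequent = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if support_count(frozenset(c)) >= min_support_count
    ]
    # Step 2: split each frequent item set into antecedent => consequent
    # and keep the rules with sufficient confidence.
    rules = []
    for itemset in frequent:
        for k in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), k):
                ante = frozenset(ante)
                conf = support_count(itemset) / support_count(ante)
                if conf >= min_confidence:
                    rules.append((set(ante), set(itemset - ante), conf))
    return rules


rules = generate_rules(min_support_count=2, min_confidence=0.6)
for ante, cons, conf in rules:
    print(sorted(ante), "=>", sorted(cons), f"confidence {conf:.0%}")
```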
29
Example: Rules from {red, white, green}
{red, white} => {green} with confidence = 2/4 = 50%
  [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100%
  [(support {red, white, green}) / (support {red, green})]
Plus 4 more with confidence of 100%, 33%, 29% & 100%
If confidence criterion is 70%, report only rules 2, 3 and 6
30
All Rules (XLMiner Output)
Rule  Conf. %  Antecedent (a)   Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1     100      Green =>         Red, White      2           4           2               2.5
2     100      Green =>         Red             2           6           2               1.666667
3     100      Green, White =>  Red             2           6           2               1.666667
4     100      Green =>         White           2           7           2               1.428571
5     100      Green, Red =>    White           2           7           2               1.428571
6     100      Orange =>        White           2           7           2               1.428571
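The lift ratios in this output can be recomputed from the other columns as confidence divided by the consequent's share of all transactions. The total number of transactions is not shown in the table; the value N = 10 used below is an assumption, inferred only because it reproduces the reported ratios (for rule 1, 1.00 / (4/10) = 2.5).

```python
# (rule, support(a), support(c), support(a U c), reported lift ratio)
table = [
    (1, 2, 4, 2, 2.5),
    (2, 2, 6, 2, 1.666667),
    (3, 2, 6, 2, 1.666667),
    (4, 2, 7, 2, 1.428571),
    (5, 2, 7, 2, 1.428571),
    (6, 2, 7, 2, 1.428571),
]

N = 10  # ASSUMPTION: total transactions, inferred from the reported lift ratios

recomputed = []
for rule, sup_a, sup_c, sup_ac, reported in table:
    conf = sup_ac / sup_a            # confidence = support(a U c) / support(a)
    lift_ratio = conf / (sup_c / N)  # confidence relative to the consequent's base rate
    recomputed.append((rule, conf, lift_ratio, reported))
    print(rule, f"{conf:.0%}", round(lift_ratio, 6), reported)
```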
31
Interpretation
Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)
Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)
Support measures overall impact
32
Caution The Role of Chance
Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger
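A small simulation can illustrate the point. In the sketch below every item is placed in each basket independently at random, so no real association exists, yet some item pairs still show a lift well above 1. The sizes (50 baskets, 20 items, 30% purchase probability), the 1.3 lift cutoff and the fixed seed are all hypothetical choices made for this example.

```python
import random
from itertools import combinations

random.seed(0)  # fixed seed so the run is repeatable
N_BASKETS, N_ITEMS, P_BUY = 50, 20, 0.3  # hypothetical sizes

# Every item is bought independently at random: no real associations exist.
baskets = [
    {i for i in range(N_ITEMS) if random.random() < P_BUY}
    for _ in range(N_BASKETS)
]


def count(*items):
    """Number of baskets containing all the given items."""
    return sum(1 for b in baskets if all(i in b for i in items))


spurious = []
for a, b in combinations(range(N_ITEMS), 2):
    n_a, n_b, n_ab = count(a), count(b), count(a, b)
    if n_a and n_b:
        lift = n_ab * N_BASKETS / (n_a * n_b)
        if lift >= 1.3:
            spurious.append((a, b, lift))

total_pairs = N_ITEMS * (N_ITEMS - 1) // 2
print(len(spurious), "of", total_pairs, "pairs show lift >= 1.3 by chance alone")
```

Raising N_BASKETS while keeping the other settings should shrink the share of such pairs, in line with the last bullet above.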
28
Process of Rule Selection
Generate all rules that meet Generate all rules that meet specified support amp confidencespecified support amp confidence
Find frequent item sets (those with Find frequent item sets (those with sufficient support ndash see above)sufficient support ndash see above)
From these item sets generate rules From these item sets generate rules with sufficient confidencewith sufficient confidence
29
Example Rules from red white green
red white gt green with confidence = 24 = 50 red white gt green with confidence = 24 = 50 [(support red white green)(support red white)][(support red white green)(support red white)]
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
    Customers
    Orders (basic purchase data)
    Items (merchandise/services purchased)
34
Market Basket Analysis
Retail - each customer purchases different set of products, different quantities, different times
MBA uses this information to
    Identify who customers are (not by name)
    Understand why they make certain purchases
    Gain insight about its merchandise (products)
        Fast and slow movers
        Products which are purchased together
        Products which might benefit from promotion
    Take action
        Store layouts
        Which products to put on specials, promote, coupons...
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
    AR represent patterns in the data without a specified target variable
    Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z.
    support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y).
    confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y)P(Z).
    interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one.
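The three measures translate directly into code. The sketch below is one possible implementation; the function name rule_measures and the four sample baskets are hypothetical and serve only to exercise the formulas.

```python
def rule_measures(baskets, y, z):
    """Support, confidence and interest of the rule y => z."""
    n = len(baskets)
    p_y = sum(y in b for b in baskets) / n               # P(Y)
    p_z = sum(z in b for b in baskets) / n               # P(Z)
    p_yz = sum(y in b and z in b for b in baskets) / n   # P(Y and Z)
    return {
        "support": p_yz,
        "confidence": p_yz / p_y,
        "interest": p_yz / (p_y * p_z),
    }


# Hypothetical baskets, not taken from the slides.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"jam"},
]

measures = rule_measures(baskets, "bread", "butter")
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Here the interest exceeds one, meaning bread and butter appear together more often than independence would predict.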
38
Association Rules Apply Elsewhere
Besides retail - supermarkets, etc...
    Purchases made using credit/debit cards
    Optional Telco Service purchases
    Banking services
    Unusual combinations of insurance claims can be a warning of fraud
    Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
    Avg # of orders/customer
    Avg # unique items/order
    Avg # of items/order
    For a product
        What % of customers have purchased
        Avg # orders/customer include it
        Avg quantity of it purchased/order
    Etc...
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as Shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
    Actionable Rules - contain high-quality, actionable information
    Trivial Rules - information already well-known by those familiar with the business
    Inexplicable Rules - no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
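The co-occurrence counts can be derived mechanically from the five POS transactions. A minimal sketch follows; the variable names are illustrative. Each diagonal cell counts the baskets containing that item, and each off-diagonal cell counts the baskets containing both items.

```python
from collections import Counter
from itertools import combinations

# The five POS transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]

pair_counts = Counter()
for basket in transactions:
    for item in basket:
        pair_counts[(item, item)] += 1      # diagonal: baskets holding the item
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1            # off-diagonal, kept symmetric
        pair_counts[(b, a)] += 1

# Missing pairs default to 0 because Counter returns 0 for unseen keys.
matrix = [[pair_counts[(row, col)] for col in items] for row in items]
for name, row in zip(items, matrix):
    print(f"{name:15}", row)
```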
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
Simple patterns
1. Coke and soda are purchased together more often than most other pairs (only Coke and detergent match them, at 2 baskets each)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
    If a customer purchases soda, then customer also purchases Coke
    2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
    2 out of 4 Coke purchases also include soda, so 50%
Confidence = Ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
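Confidence can also be read straight off the co-occurrence counts: divide the cell for the item pair by the diagonal cell of the "if" item. The sketch below hard-codes the matrix from the earlier slide; the function name confidence is illustrative.

```python
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]

# Co-occurrence counts from the slide; row and column order follow `items`.
co = [
    [4, 1, 1, 2, 2],
    [1, 2, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [2, 1, 0, 3, 1],
    [2, 0, 0, 1, 2],
]


def confidence(if_item, then_item):
    """Baskets with both items / baskets with the 'if' item."""
    i, j = items.index(if_item), items.index(then_item)
    return co[i][j] / co[i][i]


print(f"soda => Coke: {confidence('soda', 'Coke'):.0%}")
print(f"Coke => soda: {confidence('Coke', 'soda'):.0%}")
```

The two directions differ because the denominator changes: 3 soda baskets in one case, 4 Coke baskets in the other.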
48
How Good is an Association Rule
How much better than chance is a rule?
    Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift... When lift > 1, then the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
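Applied to the five POS transactions, the soda/Coke rule turns out to be a case where the negative rule does better. The sketch below is one way to compute it; the function name lift and its negate flag are assumptions made for illustration.

```python
# The five POS transactions from the earlier slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def lift(if_item, then_item, negate=False):
    """Lift of 'if_item => then_item', or of 'if_item => NOT then_item'."""
    def hit(basket):
        # With negate=True the rule predicts the absence of then_item.
        return (then_item in basket) != negate

    with_if = [b for b in transactions if if_item in b]
    confidence = sum(hit(b) for b in with_if) / len(with_if)
    baseline = sum(hit(b) for b in transactions) / len(transactions)
    return confidence / baseline


print(f"soda => Coke:     lift = {lift('soda', 'Coke'):.2f}")
print(f"soda => not Coke: lift = {lift('soda', 'Coke', negate=True):.2f}")
```

Coke is in 4 of 5 baskets overall but only 2 of 3 soda baskets, so the positive rule has lift below 1 and the negative rule has lift above 1.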
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
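The level-by-level counting in these steps can be sketched with a single helper that counts every k-item combination seen in the baskets. This is plain brute-force counting for illustration, not a named algorithm from the slides; count_itemsets is a hypothetical name.

```python
from collections import Counter
from itertools import combinations

# The five POS transactions from the earlier slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def count_itemsets(baskets, k):
    """Count how many baskets contain each k-item combination."""
    counts = Counter()
    for basket in baskets:
        for combo in combinations(sorted(basket), k):
            counts[frozenset(combo)] += 1
    return counts


for k in (1, 2, 3):
    counts = count_itemsets(transactions, k)
    print(f"{k}-item combinations observed: {len(counts)}")
```

Only combinations that actually occur in some basket are stored, which keeps the counts far smaller than the number of theoretically possible combinations.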
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant... could have 100 items on its menu
    How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
    About 50 million 2-item combinations
    More than 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time-period could also be huge (hence expensive to analyze)
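These combination counts are binomial coefficients and can be checked in a few lines. The sketch uses Python's math.comb (available from Python 3.8); the variable names are illustrative.

```python
from math import comb

# Number of ways to choose k distinct items out of n, ignoring order.
fast_food_triples = comb(100, 3)
supermarket_pairs = comb(10_000, 2)
supermarket_triples = comb(10_000, 3)

print(f"100 menu items, 3 at a time:  {fast_food_triples:,}")
print(f"10,000 items, 2 at a time:    {supermarket_pairs:,}")
print(f"10,000 items, 3 at a time:    {supermarket_triples:,}")
```

The exact supermarket figures are 49,995,000 pairs and 166,616,670,000 triples, so the round numbers on the slide are order-of-magnitude estimates.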
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well defined and intelligible information of the form
    account_1 and account_2 etc., or activity_1 and activity_2 etc., possibly indexed by time
As such, rules found provide guide to action: to offer product or service (cross-sell)
61
In retailing case of items purchased together, guidance is not so clear cut due to extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
red green gt white with confidence = 22 = 100red green gt white with confidence = 22 = 100 [(support red white green)(support red green)][(support red white green)(support red green)]
Plus 4 more with confidence of 100 33 29 amp 100Plus 4 more with confidence of 100 33 29 amp 100
If confidence criterion is 70 report only rules 2 3 and 6If confidence criterion is 70 report only rules 2 3 and 6
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association RulesThe Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Finally know that the number of Finally know that the number of transactions in a given time-period could transactions in a given time-period could also be huge (hence expensive to analyze)also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items In retailing case of items purchased together guidance is purchased together guidance is not so clear cut due to extensive not so clear cut due to extensive number of rulesnumber of rules
62
Challenges
A major difficulty is that a large number of A major difficulty is that a large number of the rules found may be trivial for anyone the rules found may be trivial for anyone familiar with the business familiar with the business
The computational complexity involved in The computational complexity involved in calculating the results of market basket calculating the results of market basket analysis is at least the square of the number analysis is at least the square of the number of transaction item-lines (records of every of transaction item-lines (records of every item purchased) With data warehouses item purchased) With data warehouses storing billions of transaction lines this storing billions of transaction lines this yields extremely high computational yields extremely high computational requirements requirements
63
Solutions
Differential market basket analysisDifferential market basket analysis can find interesting results and can also can find interesting results and can also eliminate the problem of a potentially eliminate the problem of a potentially high volume of trivial resultshigh volume of trivial results
Special techniques involving Special techniques involving filtering filtering or aggregationor aggregation of the transaction of the transaction database are commonly used to in database are commonly used to in analysis algorithms to increase analysis algorithms to increase performance and allow some level of performance and allow some level of interactivity such as in business interactivity such as in business intelligence applicationsintelligence applications
64
Thank You
30
All Rules (XLMiner Output)
Rule Conf Antecedent (a) Consequent (c) Support(a) Support(c) Support(a U c) Lift Ratio1 100 Green=gt Red White 2 4 2 252 100 Green=gt Red 2 6 2 16666673 100 Green White=gt Red 2 6 2 16666674 100 Green=gt White 2 7 2 14285715 100 Green Red=gt White 2 7 2 14285716 100 Orange=gt White 2 7 2 1428571
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
  Customers
  Orders (basic purchase data)
  Items (merchandise/services purchased)
34
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to
  Identify who customers are (not by name)
  Understand why they make certain purchases
  Gain insight about its merchandise (products)
    Fast and slow movers
    Products which are purchased together
    Products which might benefit from promotion
  Take action
    Store layouts
    Which products to put on specials, promote, coupons…
Combining all of this with a customer loyalty card, it becomes even more valuable
35
Association Rules
Data mining (DM) technique most closely allied with Market Basket Analysis
Association rules (AR) can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y → Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the expected frequency of co-occurrence under the assumption of independence of Y and Z, P(Y) P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
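The three measures and the discovery step can be sketched as follows. The baskets are hypothetical and not from the slides; the function names and the threshold values (0.3 support, 0.7 confidence, interest of at least 1) are illustrative assumptions.

```python
from itertools import permutations

# Hypothetical baskets for illustration only.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "jam"},
    {"milk"},
    {"bread", "butter", "milk"},
]


def prob(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    return sum(items <= b for b in baskets) / len(baskets)


def measures(y, z, baskets):
    """Support, confidence and interest of the rule y -> z."""
    p_y, p_z, p_yz = prob({y}, baskets), prob({z}, baskets), prob({y, z}, baskets)
    return {
        "support": p_yz,                   # P(Y and Z)
        "confidence": p_yz / p_y,          # P(Y and Z) / P(Y)
        "interest": p_yz / (p_y * p_z),    # P(Y and Z) / (P(Y) P(Z))
    }


def discover(baskets, min_support=0.3, min_confidence=0.7, min_interest=1.0):
    """Keep single-item rules y -> z that clear all three thresholds."""
    items = sorted(set().union(*baskets))
    rules = {}
    for y, z in permutations(items, 2):
        m = measures(y, z, baskets)
        if (m["support"] >= min_support
                and m["confidence"] >= min_confidence
                and m["interest"] >= min_interest):
            rules[(y, z)] = m
    return rules


for (y, z), m in discover(baskets).items():
    print(f"{y} -> {z}: {m}")
```

Note that bread -> jam is dropped while jam -> bread is kept: support and interest are symmetric in Y and Z, but confidence is not.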
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)…
  Purchases made using credit/debit cards
  Optional Telco Service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg # of orders/customer
  Avg # unique items/order
  Avg # of items/order
  For a product
    What % of customers have purchased
    Avg # orders/customer include it
    Avg quantity of it purchased/order
  Etc…
Visualization is extremely helpful
Transaction Data
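A few of these questions can be answered directly from order lines. The sketch below uses a hypothetical order-line layout (customer, order, item, quantity) and made-up rows; the deck's actual schema is not shown in the text, and order ids are assumed to be unique across customers.

```python
from collections import defaultdict

# Hypothetical order lines: (customer_id, order_id, item, quantity).
order_lines = [
    ("c1", "o1", "coke", 2),
    ("c1", "o1", "soda", 1),
    ("c1", "o2", "coke", 1),
    ("c2", "o3", "milk", 1),
    ("c2", "o3", "coke", 3),
    ("c3", "o4", "detergent", 1),
]

orders_by_customer = defaultdict(set)
items_by_order = defaultdict(set)
units_by_order = defaultdict(int)
for customer, order, item, qty in order_lines:
    orders_by_customer[customer].add(order)
    items_by_order[order].add(item)
    units_by_order[order] += qty

n_customers = len(orders_by_customer)
n_orders = len(items_by_order)

avg_orders_per_customer = n_orders / n_customers
avg_unique_items_per_order = sum(len(s) for s in items_by_order.values()) / n_orders
avg_units_per_order = sum(units_by_order.values()) / n_orders


def pct_customers_buying(item):
    """Percentage of customers with at least one order line for `item`."""
    buyers = {c for c, _order, i, _qty in order_lines if i == item}
    return 100 * len(buyers) / n_customers


print(avg_orders_per_customer, avg_unique_items_per_order, avg_units_per_order)
print(pct_customers_buying("coke"))
```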
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well-known by those familiar with the business
  Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
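The co-occurrence counts can be rebuilt from the five POS transactions. The sketch below is one straightforward way to do it; the function name co_occurrence and the nested-dictionary layout are illustrative choices.

```python
# The five POS transactions from the slide.
ITEMS = ["Coke", "Window cleaner", "Milk", "Soda", "Detergent"]
baskets = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]


def co_occurrence(baskets, items):
    """matrix[a][b] = number of baskets containing both a and b.

    The diagonal matrix[a][a] is the number of baskets containing a.
    """
    matrix = {a: {b: 0 for b in items} for a in items}
    for basket in baskets:
        for a in basket:
            for b in basket:
                matrix[a][b] += 1
    return matrix


matrix = co_occurrence(baskets, ITEMS)
for a in ITEMS:
    print(a.ljust(15), [matrix[a][b] for b in ITEMS])
```

The matrix is symmetric, so only the diagonal and one triangle carry independent information.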
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2

Simple patterns
1. Coke and soda are purchased together as often as any other two items (2 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
  If a customer purchases soda, then customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
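Both confidence figures can be read straight off the deck's co-occurrence counts: the off-diagonal entry gives the baskets with all the items, and the diagonal entry gives the baskets with the "if" item. A minimal sketch, with only the needed entries copied and an illustrative function name:

```python
from fractions import Fraction

# Entries copied from the co-occurrence matrix (Coke and Soda rows only).
M = {
    "Coke": {"Coke": 4, "Soda": 2},
    "Soda": {"Coke": 2, "Soda": 3},
}


def confidence(m, if_item, then_item):
    """Baskets with both items divided by baskets with the 'if' item."""
    return Fraction(m[if_item][then_item], m[if_item][if_item])


soda_then_coke = confidence(M, "Soda", "Coke")
coke_then_soda = confidence(M, "Coke", "Soda")
print(f"soda -> Coke: {soda_then_coke} = {float(soda_then_coke):.0%}")
print(f"Coke -> soda: {coke_then_soda} = {float(coke_then_soda):.0%}")
```

The numerator is the same in both directions; only the denominator changes, which is why reversing a rule changes its confidence.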
48
How Good is an Association Rule
How much better than chance is a rule?
  Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift… When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
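Applied to the deck's five POS transactions, the rule "if soda then Coke" turns out to have lift below 1, so it also illustrates the negative rule. The sketch below uses exact fractions; the function name lift and the predicate-based interface are illustrative assumptions.

```python
from fractions import Fraction

# The five POS transactions from the deck.
baskets = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]


def lift(baskets, has_if, has_then):
    """Observed count of baskets matching both predicates over the expected count.

    Expected count under no relationship is n_if * n_then / n.
    """
    n = len(baskets)
    n_if = sum(has_if(b) for b in baskets)
    n_then = sum(has_then(b) for b in baskets)
    n_both = sum(has_if(b) and has_then(b) for b in baskets)
    return Fraction(n_both * n, n_if * n_then)


def has_soda(b):
    return "Soda" in b


def has_coke(b):
    return "Coke" in b


def no_coke(b):
    return "Coke" not in b


rule_lift = lift(baskets, has_soda, has_coke)      # if soda then Coke
negative_lift = lift(baskets, has_soda, no_coke)   # if soda then NOT Coke
print(rule_lift, float(rule_lift))
print(negative_lift, float(negative_lift))
```

Confidence for soda -> Coke is 67%, but Coke is already in 80% of baskets, so the rule predicts Coke worse than simply assuming it; the negative rule is the one with lift above 1.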
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items… "if Coke then soda"
2. Generate co-occurrence matrix for two items… "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items… "if Coke and Milk and Window Cleaner then soda"
4. Etc…
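The level-by-level idea in these steps can be sketched as a simple counting loop over the deck's five transactions. The minimum-count cutoff and the step that drops items no longer appearing in any surviving combination are assumptions added here to keep the search small; the slide does not specify them.

```python
from itertools import combinations

# The five POS transactions from the deck.
baskets = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]


def frequent_itemsets(baskets, min_count=2, max_size=3):
    """Count item combinations of size 1, 2, 3, ... one level at a time.

    After each level, only items that appear in a surviving combination
    are carried forward (assumed pruning step).
    """
    items = sorted(set().union(*baskets))
    levels = {}
    for size in range(1, max_size + 1):
        counts = {}
        for combo in combinations(items, size):
            n = sum(set(combo) <= b for b in baskets)
            if n >= min_count:
                counts[combo] = n
        if not counts:
            break
        levels[size] = counts
        items = sorted({i for combo in counts for i in combo})
    return levels


for size, counts in frequent_itemsets(baskets).items():
    print(size, counts)
```

On this data no three-item combination reaches the cutoff, so the loop stops after pairs.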
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant… could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket… 10,000 or more unique items
  About 50 million 2-item combinations
  Over 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time-period could also be huge (hence expensive to analyze)
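The combination counts follow from the binomial coefficient "n choose k". A quick check using Python's standard math.comb (available from Python 3.8), taking exactly 100 and 10,000 items:

```python
import math

fast_food_triples = math.comb(100, 3)       # 3 different items from 100
supermarket_pairs = math.comb(10_000, 2)    # 2 different items from 10,000
supermarket_triples = math.comb(10_000, 3)  # 3 different items from 10,000

print(f"{fast_food_triples:,}")     # 161,700
print(f"{supermarket_pairs:,}")     # 49,995,000
print(f"{supermarket_triples:,}")   # 166,616,670,000
```

Each extra item in the combination multiplies the count by roughly n/k, which is why the search space grows so quickly with combination size.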
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well defined and intelligible information of the form
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
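Aggregation and filtering can be sketched on the deck's five POS transactions. The product hierarchy below (beverages, dairy, cleaning) is hypothetical, as are the function names and the choice to filter out single-category baskets; they stand in for whatever groupings and filters a real application would use.

```python
from math import comb

# The five POS transactions from the deck.
baskets = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]

# Hypothetical product hierarchy: item -> category.
CATEGORY = {
    "Coke": "beverages",
    "Soda": "beverages",
    "Milk": "dairy",
    "Detergent": "cleaning",
    "Window cleaner": "cleaning",
}


def aggregate(baskets, hierarchy):
    """Roll each basket up from items to categories."""
    return [{hierarchy[item] for item in basket} for basket in baskets]


def filter_small(baskets, min_items=2):
    """Drop baskets too small to contribute to any pairwise rule."""
    return [b for b in baskets if len(b) >= min_items]


aggregated = aggregate(baskets, CATEGORY)
kept = filter_small(aggregated)

n_items = len(set().union(*baskets))
n_categories = len(set().union(*aggregated))
print("candidate pairs before:", comb(n_items, 2))
print("candidate pairs after: ", comb(n_categories, 2))
print("baskets kept after filtering:", len(kept))
```

Rolling items up to categories shrinks the number of candidate pairs, at the cost of rules that are less specific.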
64
Thank You
31
Interpretation
Lift ratio Lift ratio shows how effective the rule is shows how effective the rule is in finding consequents (useful if finding in finding consequents (useful if finding particular consequents is important)particular consequents is important)
ConfidenceConfidence shows the rate at which shows the rate at which consequents will be found (useful in consequents will be found (useful in learning costs of promotion) learning costs of promotion)
SupportSupport measures overall impact measures overall impact
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association RulesThe Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Finally know that the number of Finally know that the number of transactions in a given time-period could transactions in a given time-period could also be huge (hence expensive to analyze)also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items In retailing case of items purchased together guidance is purchased together guidance is not so clear cut due to extensive not so clear cut due to extensive number of rulesnumber of rules
62
Challenges
A major difficulty is that a large number of A major difficulty is that a large number of the rules found may be trivial for anyone the rules found may be trivial for anyone familiar with the business familiar with the business
The computational complexity involved in The computational complexity involved in calculating the results of market basket calculating the results of market basket analysis is at least the square of the number analysis is at least the square of the number of transaction item-lines (records of every of transaction item-lines (records of every item purchased) With data warehouses item purchased) With data warehouses storing billions of transaction lines this storing billions of transaction lines this yields extremely high computational yields extremely high computational requirements requirements
63
Solutions
Differential market basket analysisDifferential market basket analysis can find interesting results and can also can find interesting results and can also eliminate the problem of a potentially eliminate the problem of a potentially high volume of trivial resultshigh volume of trivial results
Special techniques involving Special techniques involving filtering filtering or aggregationor aggregation of the transaction of the transaction database are commonly used to in database are commonly used to in analysis algorithms to increase analysis algorithms to increase performance and allow some level of performance and allow some level of interactivity such as in business interactivity such as in business intelligence applicationsintelligence applications
64
Thank You
32
Caution The Role of Chance
Random data can generate Random data can generate apparently interesting association apparently interesting association rulesrules
The more rules you produce the The more rules you produce the greater this dangergreater this danger
Rules based on large numbers of Rules based on large numbers of records are less subject to this dangerrecords are less subject to this danger
33
Market Basket Analysis
MBA is a set of techniques MBA is a set of techniques Association Rules being most Association Rules being most common that focus on point-of-sale common that focus on point-of-sale (p-o-s) transaction data(p-o-s) transaction data
3 types of market basket data (p-o-s 3 types of market basket data (p-o-s data)data) CustomersCustomers Orders (basic purchase data)Orders (basic purchase data) Items (merchandiseservices Items (merchandiseservices
purchased)purchased)
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products; Y is the antecedent and Z is the consequent
Support of the rule: the percentage of all baskets that contain both product Y and product Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the frequency of co-occurrence expected if Y and Z were statistically independent, P(Y) P(Z)
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence, and an interest of at least one
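The three measures and the discovery step can be sketched in a few lines of Python. This is only an illustration, not a production algorithm: the function names (prob, measures, discover) are made up for this example, the baskets are the five POS transactions used on the "How Good is an Association Rule" slides, and only rules with a single product on each side are considered.

```python
from itertools import permutations

# Five POS baskets from the "How Good is an Association Rule" slides.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def prob(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def measures(y, z, baskets):
    """Return (support, confidence, interest) for the rule y => z."""
    p_yz = prob({y, z}, baskets)
    p_y, p_z = prob({y}, baskets), prob({z}, baskets)
    return p_yz, p_yz / p_y, p_yz / (p_y * p_z)

def discover(baskets, min_support, min_confidence, min_interest=1.0):
    """List single-product rules y => z that clear all three thresholds."""
    products = sorted(set().union(*baskets))
    rules = []
    for y, z in permutations(products, 2):
        s, c, i = measures(y, z, baskets)
        if s >= min_support and c >= min_confidence and i >= min_interest:
            rules.append((y, z, s, c, i))
    return rules

for y, z, s, c, i in discover(BASKETS, min_support=0.2, min_confidence=0.6):
    print(f"{y} => {z}: support={s:.2f} confidence={c:.2f} interest={i:.2f}")
```

With these thresholds the rule soda => Coke is rejected even though its confidence is 2/3, because its interest is below one: Coke is in 80% of all baskets anyway.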
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)...
  Purchases made using credit/debit cards
  Optional telco service purchases
  Banking services
  Unusual combinations of insurance claims can be a warning of fraud
  Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered:
  Avg number of orders per customer
  Avg number of unique items per order
  Avg number of items per order
  For a product:
    What % of customers have purchased it
    Avg number of orders per customer that include it
    Avg quantity of it purchased per order
  Etc...
Visualization is extremely helpful
Transaction Data
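As a rough sketch of how such questions map onto transaction data, the snippet below computes several of the averages from a flat list of order lines. The record layout (customer, order, item, quantity), the sample rows, and the function name order_stats are all hypothetical; a real relational schema would differ, and order ids are assumed to be unique across customers.

```python
from collections import defaultdict

# Hypothetical order lines: (customer_id, order_id, item, quantity).
ORDER_LINES = [
    ("c1", "o1", "Coke", 2),
    ("c1", "o1", "soda", 1),
    ("c1", "o2", "Coke", 1),
    ("c2", "o3", "milk", 1),
    ("c2", "o3", "Coke", 3),
    ("c3", "o4", "detergent", 1),
]

def order_stats(lines, product):
    """Summarise order lines overall and for one product."""
    orders_by_customer = defaultdict(set)    # customer -> order ids
    items_by_order = defaultdict(set)        # order -> distinct items
    units_by_order = defaultdict(int)        # order -> total quantity
    product_qty_by_order = defaultdict(int)  # order -> quantity of `product`
    buyers = set()                           # customers who bought `product`
    for customer, order, item, qty in lines:
        orders_by_customer[customer].add(order)
        items_by_order[order].add(item)
        units_by_order[order] += qty
        if item == product:
            buyers.add(customer)
            product_qty_by_order[order] += qty
    n_customers = len(orders_by_customer)
    n_orders = len(items_by_order)
    n_product_orders = len(product_qty_by_order)
    return {
        "avg_orders_per_customer": n_orders / n_customers,
        "avg_unique_items_per_order":
            sum(len(items) for items in items_by_order.values()) / n_orders,
        "avg_items_per_order": sum(units_by_order.values()) / n_orders,
        "pct_customers_buying_product": 100 * len(buyers) / n_customers,
        # Average is taken over orders that include the product (0.0 if none).
        "avg_product_qty_per_order":
            sum(product_qty_by_order.values()) / n_product_orders
            if n_product_orders else 0.0,
    }

for name, value in order_stats(ORDER_LINES, "Coke").items():
    print(f"{name}: {value:.2f}")
```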
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types:
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well known by those familiar with the business
  Inexplicable Rules: have no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
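The co-occurrence table can be rebuilt directly from the five transactions. The sketch below is one simple way to do it in Python; the function name cooccurrence and the dict-of-dicts layout are choices made for this example only.

```python
from itertools import combinations

# The five POS transactions from the slide.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
ITEMS = ["Coke", "window cleaner", "milk", "soda", "detergent"]

def cooccurrence(baskets, items):
    """counts[a][b] = number of baskets containing both a and b.

    The diagonal counts[a][a] is the number of baskets containing a.
    """
    counts = {a: {b: 0 for b in items} for a in items}
    for basket in baskets:
        for item in basket:
            counts[item][item] += 1
        for a, b in combinations(basket, 2):
            counts[a][b] += 1  # the matrix is symmetric,
            counts[b][a] += 1  # so fill both cells
    return counts

matrix = cooccurrence(BASKETS, ITEMS)
for row in ITEMS:
    print(f"{row:15}", [matrix[row][col] for col in ITEMS])
```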
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke            4     1               1     2     2
Window cleaner  1     2               1     1     0
Milk            1     1               1     0     0
Soda            2     1               0     3     1
Detergent       2     0               0     1     2
Simple patterns:
1. Coke and soda are purchased together more often than most other pairs of items (2 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
  If a customer purchases soda, then the customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
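The two confidence figures on this slide can be checked with a short Python sketch. The function name confidence is made up for this example, and exact fractions are used so that 2/3 and 2/4 are not blurred by rounding.

```python
from fractions import Fraction

# The five POS transactions from the slide.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def confidence(if_items, then_items, baskets):
    """Transactions with all the items / transactions with the "if" items.

    Raises ZeroDivisionError if no basket contains the "if" items.
    """
    if_items = set(if_items)
    all_items = if_items | set(then_items)
    n_if = sum(if_items <= basket for basket in baskets)
    n_all = sum(all_items <= basket for basket in baskets)
    return Fraction(n_all, n_if)

print(confidence({"soda"}, {"Coke"}, BASKETS))  # 2/3
print(confidence({"Coke"}, {"soda"}, BASKETS))  # 1/2
```

The asymmetry is the point: the same two baskets support both rules, but the denominators differ because soda appears in 3 baskets and Coke in 4.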
48
How Good is an Association Rule
How much better than chance is a rule?
  Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the number of records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift...
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
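A small Python sketch makes the lift and Negative Rule idea concrete on the five POS transactions from the earlier slides. The function name lift and its negate flag are assumptions of this example; negating the rule here simply means "the basket does not contain the result items".

```python
# The five POS transactions from the earlier slides.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def lift(if_items, then_items, baskets, negate=False):
    """Observed count for the rule divided by the count expected under
    no relationship. With negate=True the rule is "if ... then NOT ...".

    Raises ZeroDivisionError when the expected count is zero.
    """
    has_if = [set(if_items) <= basket for basket in baskets]
    has_then = [set(then_items) <= basket for basket in baskets]
    if negate:
        has_then = [not flag for flag in has_then]
    n_both = sum(i and t for i, t in zip(has_if, has_then))
    expected = sum(has_if) * sum(has_then) / len(baskets)
    return n_both / expected

print(round(lift({"soda"}, {"Coke"}, BASKETS), 3))               # 0.833
print(round(lift({"soda"}, {"Coke"}, BASKETS, negate=True), 3))  # 1.667
```

Here "if soda then Coke" has lift below 1 (2 baskets observed against 2.4 expected), while the negative rule "if soda then not Coke" has lift above 1 (1 basket observed against 0.6 expected).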
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc...
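The steps above can be sketched as level-wise counting: count itemsets of size 1, then size 2, and so on, dropping rare items before moving to the next size. The pruning shown is a simplified, Apriori-style shortcut that is not taken from the slides; the function name frequent_itemsets and the min_count threshold are assumptions of this example.

```python
from collections import Counter
from itertools import combinations

# The five POS transactions from the earlier slides.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def frequent_itemsets(baskets, max_size, min_count):
    """Count itemsets of size 1, 2, ... max_size, keeping those that
    occur in at least `min_count` baskets."""
    frequent = {}
    allowed = set().union(*baskets)
    for size in range(1, max_size + 1):
        counts = Counter()
        for basket in baskets:
            kept = sorted(basket & allowed)  # sorted so tuples are canonical
            counts.update(combinations(kept, size))
        level = {iset: n for iset, n in counts.items() if n >= min_count}
        if not level:
            break  # no larger itemset can be frequent either
        frequent.update(level)
        # An item absent from every frequent itemset of this size cannot
        # be part of a frequent itemset of the next size, so drop it.
        allowed = {item for iset in level for item in iset}
    return frequent

for itemset, n in frequent_itemsets(BASKETS, max_size=3, min_count=2).items():
    print(itemset, n)
```

Each frequent itemset of size k then yields candidate rules with k - 1 items in the "if" part, for example ("Coke", "soda") gives "if Coke then soda" and "if soda then Coke".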
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant... could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket... 10,000 or more unique items
  About 50 million 2-item combinations
  Over 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
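The combination counts quoted above are binomial coefficients and can be checked with Python's standard library (math.comb, available from Python 3.8). The wrapper name n_combinations is made up for this example.

```python
from math import comb

def n_combinations(n_items, size):
    """Number of distinct item combinations of `size` out of `n_items`."""
    return comb(n_items, size)

print(n_combinations(100, 3))     # 161700
print(n_combinations(10_000, 2))  # 49995000
print(n_combinations(10_000, 3))  # 166616670000
```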
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form:
  account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
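Filtering and aggregation can be illustrated on the five POS transactions from the earlier slides. This is a minimal Python sketch, not a description of any particular product: the item-to-category hierarchy is hypothetical, and the function names aggregate and filter_rare are made up for this example.

```python
from collections import Counter

# The five POS transactions from the earlier slides.
BASKETS = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

# Hypothetical product hierarchy: item -> category.
HIERARCHY = {
    "Coke": "beverages",
    "soda": "beverages",
    "milk": "dairy",
    "detergent": "cleaning",
    "window cleaner": "cleaning",
}

def aggregate(baskets, hierarchy):
    """Aggregation: replace each item by its category (unmapped items stay)."""
    return [{hierarchy.get(item, item) for item in basket} for basket in baskets]

def filter_rare(baskets, min_count):
    """Filtering: drop items that appear in fewer than `min_count` baskets."""
    counts = Counter(item for basket in baskets for item in basket)
    return [{item for item in basket if counts[item] >= min_count}
            for basket in baskets]

rolled_up = aggregate(BASKETS, HIERARCHY)
print(len(set().union(*BASKETS)), "items ->", len(set().union(*rolled_up)), "categories")
print(filter_rare(BASKETS, min_count=2)[1])  # basket 2 without the rare item milk
```

Both steps shrink the number of unique items, and with it the number of item combinations that have to be counted.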
64
Thank You
34
Market Basket Analysis
Retail ndash each customer purchases different set Retail ndash each customer purchases different set of products different quantities different of products different quantities different timestimes
MBA uses this information toMBA uses this information to Identify who customers are (not by name)Identify who customers are (not by name) Understand why they make certain purchasesUnderstand why they make certain purchases Gain insight about its merchandise (products)Gain insight about its merchandise (products)
Fast and slow moversFast and slow movers Products which are purchased togetherProducts which are purchased together Products which might benefit from promotionProducts which might benefit from promotion
Take actionTake action Store layoutsStore layouts Which products to put on specials promote couponshellipWhich products to put on specials promote couponshellip
Combining all of this with a customer loyalty Combining all of this with a customer loyalty card it becomes even more valuablecard it becomes even more valuable
35
Association Rules
DM technique most closely allied DM technique most closely allied with Market Basket Analysiswith Market Basket Analysis
AR can be automatically AR can be automatically generatedgenerated AR represent patterns in the data AR represent patterns in the data
without a specified target variablewithout a specified target variable Good example of undirected data Good example of undirected data
miningmining
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types
  Actionable Rules - contain high-quality actionable information
  Trivial Rules - information already well-known by those familiar with the business
  Inexplicable Rules - no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda
POS Transactions
                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2
Co-occurrence of Products
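The co-occurrence counts can be rebuilt from the five POS transactions with a short script. This is a sketch only: the variable names are illustrative, and each diagonal cell is read as the number of baskets containing that item.

```python
# The five POS transactions from the slide, one set of items per customer.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]

# cooccurrence[a][b] = number of baskets that contain both a and b
# (for a == b this is simply the number of baskets containing a).
cooccurrence = {
    row: {
        col: sum(1 for basket in transactions if row in basket and col in basket)
        for col in items
    }
    for row in items
}

for row in items:
    print(f"{row:15}", *(f"{cooccurrence[row][col]:>3}" for col in items))
```

The matrix is symmetric, so only the upper triangle needs to be computed when the number of items is large.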
46
How Good is an Association Rule
                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2
Simple patterns
1. Coke and soda are purchased together in 2 of the 5 baskets, as often as any other pair (only Coke and detergent ties them)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
  If a customer purchases soda, then customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = Ratio of the number of transactions with all the items to the number of transactions with just the "if" items
Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda
POS Transactions
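The two confidence values can be checked with a few lines of code. This is a sketch; the function name `confidence` and its argument names are illustrative, not part of the slides.

```python
# The five POS transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def confidence(baskets, if_items, then_items):
    """Share of the baskets containing the "if" items that also contain the "then" items."""
    if_items, then_items = set(if_items), set(then_items)
    with_if = [basket for basket in baskets if if_items <= basket]
    with_all = [basket for basket in with_if if then_items <= basket]
    return len(with_all) / len(with_if)

soda_then_coke = confidence(transactions, {"soda"}, {"Coke"})   # 2 of 3 baskets
coke_then_soda = confidence(transactions, {"Coke"}, {"soda"})   # 2 of 4 baskets
print(round(soda_then_coke, 2), round(coke_then_soda, 2))
```

Confidence is not symmetric: swapping the "if" and "then" sides changes the denominator, which is why the two rules score differently.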
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift… When lift > 1 then the rule is better at predicting the result than guessing
When lift < 1 the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
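As a worked check on the five POS transactions from the earlier slides, lift can be computed for "if soda then Coke" and for its negative rule "if soda then NOT Coke". This is a sketch; the function name and the `negate_then` flag are illustrative assumptions.

```python
# The five POS transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def lift(baskets, if_item, then_item, negate_then=False):
    """Observed share of baskets matching the whole rule divided by the share
    expected if the two sides were unrelated."""
    n = len(baskets)
    has_if = [if_item in basket for basket in baskets]
    # With negate_then=True the "then" side means "basket does NOT contain then_item".
    has_then = [(then_item in basket) != negate_then for basket in baskets]
    p_if = sum(has_if) / n
    p_then = sum(has_then) / n
    p_both = sum(a and b for a, b in zip(has_if, has_then)) / n
    return p_both / (p_if * p_then)

rule_lift = lift(transactions, "soda", "Coke")                        # 0.4 / (0.6 * 0.8)
negative_lift = lift(transactions, "soda", "Coke", negate_then=True)  # 0.2 / (0.6 * 0.2)
print(round(rule_lift, 2), round(negative_lift, 2))
```

On this toy data the soda rule has a lift below 1 even though its confidence is 67%, because Coke already appears in 80% of all baskets; the negative rule scores above 1.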
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items… "if Coke then soda"
2. Generate co-occurrence matrix for two items… "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items… "if Coke and Milk and Window Cleaner then soda"
4. Etc…
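The level-by-level procedure above can be sketched as a loop that counts itemsets of growing size. The `min_count` threshold and the pruning step (dropping items that no longer appear in any frequent itemset) are assumptions added here to keep the number of candidates small; the slides do not specify them, and this is a simplified relative of Apriori-style pruning rather than a full implementation.

```python
from itertools import combinations

# The five POS transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def frequent_itemsets(baskets, min_count, max_size):
    """Count itemsets of size 1, 2, ... and keep those seen in at least min_count baskets."""
    items = sorted({item for basket in baskets for item in basket})
    candidates = [frozenset([item]) for item in items]
    result = {}
    size = 1
    while candidates and size <= max_size:
        counts = {c: sum(1 for basket in baskets if c <= basket) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= min_count}
        result.update(frequent)
        # Only items that survive at this level can appear in a larger frequent itemset.
        survivors = sorted({item for c in frequent for item in c})
        size += 1
        candidates = [frozenset(combo) for combo in combinations(survivors, size)]
    return result

found = frequent_itemsets(transactions, min_count=2, max_size=3)
for itemset, count in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With `min_count=2` milk is dropped after the first pass, so no pair or triple containing milk is ever counted.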
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant… could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket… 10,000 or more unique items
  50 million 2-item combinations
  More than 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
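The combination counts quoted above follow from the binomial coefficient "n choose k". A quick check, assuming exactly 100 menu items and exactly 10,000 supermarket items:

```python
from math import comb

menu_triples = comb(100, 3)              # 3-item combinations from 100 menu items
supermarket_pairs = comb(10_000, 2)      # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)    # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")          # 161,700
print(f"{supermarket_pairs:,}")     # 49,995,000
print(f"{supermarket_triples:,}")   # 166,616,670,000
```

The counts grow roughly as n^k / k!, which is why grouping items into a product hierarchy (a smaller n) shrinks the search so quickly.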
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide well defined and intelligible information of the form
  account_1 and account_2 etc, or activity_1 and activity_2 etc, possibly indexed by time
As such, rules found provide guide to action: to offer product or service (cross-sell)
61
In retailing case of items purchased together, guidance is not so clear cut due to extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
64
Thank You
35
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
  AR represent patterns in the data without a specified target variable
  Good example of undirected data mining
36
37
Market Basket Analysis Measures
Consider the association rule Y => Z, where Y and Z are two products. Y represents the antecedent and Z is called the consequent
Support of the rule: the percentage of all baskets that contain both product Y and Z
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y)
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence (P(Y ∧ Z)) to the expected frequency of co-occurrence under the assumption of statistical independence of Y and Z (P(Y)P(Z))
  interest = P(Y ∧ Z) / (P(Y)P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one
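The discovery step described above can be sketched for single-product rules Y => Z, using the five POS transactions from the later slides. The function name, the threshold values and the choice to require both minimum support and minimum confidence are assumptions made for illustration.

```python
from itertools import permutations

# The five POS transactions used elsewhere in the deck.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def discover_rules(baskets, min_support, min_confidence, min_interest=1.0):
    """Return (Y, Z, support, confidence, interest) for every rule Y => Z passing all thresholds."""
    n = len(baskets)
    items = sorted({item for basket in baskets for item in basket})
    p = {item: sum(item in basket for basket in baskets) / n for item in items}
    rules = []
    for y, z in permutations(items, 2):
        support = sum(y in basket and z in basket for basket in baskets) / n  # P(Y and Z)
        if support == 0:
            continue
        confidence = support / p[y]            # P(Z | Y)
        interest = support / (p[y] * p[z])     # observed vs. expected under independence
        if (support >= min_support and confidence >= min_confidence
                and interest >= min_interest):
            rules.append((y, z, support, confidence, interest))
    return rules

strong_rules = discover_rules(transactions, min_support=0.4, min_confidence=0.5)
for y, z, support, confidence, interest in strong_rules:
    print(f"{y} => {z}: support={support:.2f} confidence={confidence:.2f} interest={interest:.2f}")
```

With these thresholds only the Coke/detergent pair survives; Coke and soda reach the same support but have an interest below one.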
36
37
Market Basket Analysis Measures
Consider the association rule Y 1048782 Z where Y and Z are two products Y Consider the association rule Y 1048782 Z where Y and Z are two products Y represents the antecedent en Z is called the consequentrepresents the antecedent en Z is called the consequent
Support Support of the rule the percentage of all baskets that contain both of the rule the percentage of all baskets that contain both product Y and Zproduct Y and Zsupport = P(Y Λ Z)support = P(Y Λ Z)
Confidence Confidence of the rule the percentage of all the baskets containing Y that of the rule the percentage of all the baskets containing Y that also contain Zalso contain ZHence confidence is a conditional probability ie P(Z|Y)Hence confidence is a conditional probability ie P(Z|Y)confidence = P(Y Λ Z)P(Y)confidence = P(Y Λ Z)P(Y)
Interest Interest of the rule measures the statistical dependence of the rule by of the rule measures the statistical dependence of the rule by relating the observed frequency of occurrence (P(Y Λ Z)) to the expected relating the observed frequency of occurrence (P(Y Λ Z)) to the expected frequency of co-occurrence under the assumption of conditional frequency of co-occurrence under the assumption of conditional independence of Y and Z (P(Y)P(Z))independence of Y and Z (P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))interest = P(Y Λ Z)(P(Y)P(Z))
Association-rule discovery is the process of finding strong product Association-rule discovery is the process of finding strong product associations with aassociations with aminimum support andor confidence and an interest of at least oneminimum support andor confidence and an interest of at least one
38
Association Rules Apply Elsewhere
Besides retail ndash supermarkets etchellipBesides retail ndash supermarkets etchellip Purchases made using creditdebit Purchases made using creditdebit
cardscards Optional Telco Service purchasesOptional Telco Service purchases Banking servicesBanking services Unusual combinations of insurance Unusual combinations of insurance
claims can be a warning of fraudclaims can be a warning of fraud Medical patient historiesMedical patient histories
39
A certainty measure for A certainty measure for association rules of the form ldquoA association rules of the form ldquoA =gt Brdquo where A and B are sets of =gt Brdquo where A and B are sets of items is confidenceitems is confidence
Given a set of task Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answeredLots of questions can be answered Avg of orderscustomerAvg of orderscustomer Avg unique itemsorderAvg unique itemsorder Avg of itemsorderAvg of itemsorder For a productFor a product
What of customers have purchasedWhat of customers have purchased Avg orderscustomer include itAvg orderscustomer include it Avg quantity of it purchasedorderAvg quantity of it purchasedorder
EtchellipEtchellip Visualization is extremely helpfulVisualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrapDid the order use gift wrap Billing address same as Shipping addressBilling address same as Shipping address Did purchaser acceptdecline a cross-sellDid purchaser acceptdecline a cross-sell What is the most common item found on a What is the most common item found on a
one-item orderone-item order What is the most common item found on a What is the most common item found on a
multi-item ordermulti-item order What is the most common item for repeat What is the most common item for repeat
customer purchasescustomer purchases How has ordering of an item changed over How has ordering of an item changed over
timetime How does the ordering of an item vary How does the ordering of an item vary
geographicallygeographically
43
Association Rules
Wal-Mart customers who purchase Wal-Mart customers who purchase Barbie dolls have a 60 likelihood of Barbie dolls have a 60 likelihood of also purchasing one of three types of also purchasing one of three types of candy bars candy bars
Customers who purchase maintenance Customers who purchase maintenance agreements are very likely to purchase agreements are very likely to purchase large appliances When a new hardware large appliances When a new hardware store opens one of the most commonly store opens one of the most commonly sold items is toilet bowl cleanerssold items is toilet bowl cleaners
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association RulesThe Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Finally know that the number of Finally know that the number of transactions in a given time-period could transactions in a given time-period could also be huge (hence expensive to analyze)also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items In retailing case of items purchased together guidance is purchased together guidance is not so clear cut due to extensive not so clear cut due to extensive number of rulesnumber of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
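Aggregation of the transaction database can be illustrated with a product hierarchy. The sketch below is an assumption-laden example rather than anything from the slides: the `HIERARCHY` mapping, the group names and the helper functions are all hypothetical, and the baskets are illustrative.

```python
from collections import Counter
from itertools import combinations

# Hypothetical product hierarchy: item -> product group (not from the slides).
HIERARCHY = {
    "Coke": "beverages",
    "soda": "beverages",
    "milk": "dairy",
    "window cleaner": "cleaning",
    "detergent": "cleaning",
}

def aggregate(transactions, hierarchy):
    """Replace each item by its product group; unmapped items keep their own name."""
    return [sorted({hierarchy.get(item, item) for item in t}) for t in transactions]

def pair_counts(transactions):
    """Count how many transactions contain each unordered pair of distinct entries."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(set(t)), 2))
    return counts

transactions = [
    ["Coke", "soda"],
    ["milk", "Coke", "window cleaner"],
    ["Coke", "detergent"],
    ["Coke", "detergent", "soda"],
    ["window cleaner", "soda"],
]
item_pairs = pair_counts(transactions)
group_pairs = pair_counts(aggregate(transactions, HIERARCHY))
print(len(item_pairs), len(group_pairs))   # 7 3
print(group_pairs)
```

Grouping trades detail for tractability: fewer distinct pairs have to be counted, but a rule about "beverages" no longer says which beverage drives it.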
64
Thank You
37
Market Basket Analysis Measures
Consider the association rule Y ⇒ Z, where Y and Z are two products. Y is the antecedent and Z is the consequent.
Support of the rule: the percentage of all baskets that contain both product Y and product Z.
  support = P(Y ∧ Z)
Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence confidence is a conditional probability, i.e. P(Z|Y).
  confidence = P(Y ∧ Z) / P(Y)
Interest of the rule: measures the statistical dependence of the rule by relating the observed frequency of co-occurrence, P(Y ∧ Z), to the frequency of co-occurrence expected if Y and Z were independent, P(Y) P(Z).
  interest = P(Y ∧ Z) / (P(Y) P(Z))
Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one.
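A minimal sketch of rule discovery with these three measures follows. It assumes single-product rules Y ⇒ Z and requires all three thresholds at once (the slide allows "and/or" for support and confidence); the function name `discover_rules`, the threshold values and the baskets are hypothetical.

```python
from itertools import permutations

def discover_rules(baskets, min_support=0.4, min_confidence=0.6, min_interest=1.0):
    """Return (Y, Z, support, confidence, interest) for each rule Y => Z passing all thresholds."""
    n = len(baskets)
    sets = [set(b) for b in baskets]
    items = sorted(set().union(*sets))
    p = {i: sum(i in s for s in sets) / n for i in items}       # P(item)
    rules = []
    for y, z in permutations(items, 2):
        support = sum(y in s and z in s for s in sets) / n      # P(Y and Z)
        if support == 0:
            continue
        confidence = support / p[y]                             # P(Z | Y)
        interest = support / (p[y] * p[z])                      # observed / expected
        if support >= min_support and confidence >= min_confidence and interest >= min_interest:
            rules.append((y, z, support, confidence, interest))
    return rules

# Hypothetical baskets, chosen only to exercise the thresholds.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "milk"],
    ["milk", "jam"],
    ["bread", "butter", "milk"],
]
rules = discover_rules(baskets)
for y, z, s, c, i in rules:
    print(f"{y} => {z}: support={s:.2f} confidence={c:.2f} interest={i:.2f}")
```

With these baskets only bread ⇒ butter and butter ⇒ bread survive. The rule milk ⇒ bread meets the support and confidence thresholds but has interest below one, so it is dropped.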
38
Association Rules Apply Elsewhere
Besides retail (supermarkets, etc.)…
Purchases made using credit/debit cards
Optional telco service purchases
Banking services
Unusual combinations of insurance claims can be a warning of fraud
Medical patient histories
39
A certainty measure for association rules of the form "A => B", where A and B are sets of items, is confidence
Given a set of task
40
Typical Data Structure (Relational Database)
Lots of questions can be answered
  Avg. number of orders per customer
  Avg. number of unique items per order
  Avg. number of items per order
  For a product:
    What share of customers have purchased it
    Avg. number of orders per customer that include it
    Avg. quantity of it purchased per order
  Etc…
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Billing address same as shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
44
Association Rules
Association rule types
  Actionable rules – contain high-quality actionable information
  Trivial rules – information already well known by those familiar with the business
  Inexplicable rules – no explanation and do not suggest action
Trivial and inexplicable rules occur most often
45
How Good is an Association Rule
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
Co-occurrence of Products
Product          Coke  Window cleaner  Milk  Soda  Detergent
Coke             4     1               1     2     2
Window cleaner   1     2               1     1     0
Milk             1     1               1     0     0
Soda             2     1               0     3     1
Detergent        2     0               0     1     2
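The co-occurrence table above can be rebuilt from the five POS transactions. The sketch below is one straightforward way to do it; the function name `cooccurrence` and the list-of-lists layout are assumptions. Each diagonal cell holds the number of baskets containing that product.

```python
ITEMS = ["Coke", "window cleaner", "milk", "soda", "detergent"]

transactions = [
    ["Coke", "soda"],
    ["milk", "Coke", "window cleaner"],
    ["Coke", "detergent"],
    ["Coke", "detergent", "soda"],
    ["window cleaner", "soda"],
]

def cooccurrence(transactions, items):
    """Return a square matrix where cell [a][b] counts baskets containing both a and b."""
    index = {item: k for k, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in items]
    for basket in transactions:
        present = set(basket)            # ignore repeated items within a basket
        for a in present:
            for b in present:
                matrix[index[a]][index[b]] += 1
    return matrix

matrix = cooccurrence(transactions, ITEMS)
for name, row in zip(ITEMS, matrix):
    print(f"{name:15s}", row)
```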
46
How Good is an Association Rule
Product          Coke  Window cleaner  Milk  Soda  Detergent
Coke             4     1               1     2     2
Window cleaner   1     2               1     1     0
Milk             1     1               1     0     0
Soda             2     1               0     3     1
Detergent        2     0               0     1     2
Simple patterns
1. Coke and soda are purchased together in two baskets, more often than any other pair except Coke and detergent (also two)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule: "If a customer purchases soda, then customer also purchases Coke"?
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
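The two confidence figures can be reproduced directly from the POS transactions. This is a minimal sketch of the ratio defined on the slide; the function name `confidence` and its argument names are assumptions.

```python
transactions = [
    ["Coke", "soda"],
    ["milk", "Coke", "window cleaner"],
    ["Coke", "detergent"],
    ["Coke", "detergent", "soda"],
    ["window cleaner", "soda"],
]

def confidence(transactions, if_items, then_items):
    """Transactions containing all items, divided by transactions containing the 'if' items."""
    if_set, then_set = set(if_items), set(then_items)
    with_if = [t for t in transactions if if_set <= set(t)]
    if not with_if:
        raise ValueError("no transaction contains the 'if' items")
    with_all = [t for t in with_if if then_set <= set(t)]
    return len(with_all) / len(with_if)

soda_then_coke = confidence(transactions, ["soda"], ["Coke"])
coke_then_soda = confidence(transactions, ["Coke"], ["soda"])
print(f"{soda_then_coke:.0%}")   # 67%
print(f"{coke_then_soda:.0%}")   # 50%
```

Because the denominator depends on the "if" side, reversing a rule generally changes its confidence, as the two results show.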
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift… When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
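Lift and the negative rule can be illustrated on a five-basket example. The sketch assumes lift is computed as P(condition and result) / (P(condition) × P(result)), which matches the ratio described above; the function name `rule_lift`, the `negate_result` flag and the baskets are assumptions made for this example.

```python
transactions = [
    ["Coke", "soda"],
    ["milk", "Coke", "window cleaner"],
    ["Coke", "detergent"],
    ["Coke", "detergent", "soda"],
    ["window cleaner", "soda"],
]

def rule_lift(transactions, condition, result, negate_result=False):
    """Lift of 'if condition then result', or 'then NOT result' when negate_result is True."""
    n = len(transactions)
    cond, res = set(condition), set(result)

    def has_result(t):
        found = res <= set(t)
        return not found if negate_result else found

    p_cond = sum(cond <= set(t) for t in transactions) / n
    p_res = sum(has_result(t) for t in transactions) / n
    if p_cond == 0 or p_res == 0:
        raise ValueError("lift is undefined when either side never occurs")
    p_both = sum(cond <= set(t) and has_result(t) for t in transactions) / n
    return p_both / (p_cond * p_res)

positive = rule_lift(transactions, ["soda"], ["Coke"])
negative = rule_lift(transactions, ["soda"], ["Coke"], negate_result=True)
print(round(positive, 2))   # 0.83
print(round(negative, 2))   # 1.67
```

Here "if soda then Coke" has lift below 1 even though its confidence is 67%, because Coke already appears in 80% of baskets; the negative rule "if soda then not Coke" has lift above 1.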
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
63
Solutions
Differential market basket analysisDifferential market basket analysis can find interesting results and can also can find interesting results and can also eliminate the problem of a potentially eliminate the problem of a potentially high volume of trivial resultshigh volume of trivial results
Special techniques involving Special techniques involving filtering filtering or aggregationor aggregation of the transaction of the transaction database are commonly used to in database are commonly used to in analysis algorithms to increase analysis algorithms to increase performance and allow some level of performance and allow some level of interactivity such as in business interactivity such as in business intelligence applicationsintelligence applications
64
Thank You
40
Typical Data Structure (Relational Database)
Lots of questions can be answered:
  Average number of orders per customer
  Average number of unique items per order
  Average number of items per order
  For a product:
    What percentage of customers have purchased it
    Average number of orders per customer that include it
    Average quantity of it purchased per order
  Etc.
Visualization is extremely helpful
Transaction Data
41
Sales Order Characteristics
42
Sales Order Characteristics
Did the order use gift wrap?
Is the billing address the same as the shipping address?
Did the purchaser accept or decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types
  Actionable Rules: contain high-quality, actionable information
  Trivial Rules: information already well known by those familiar with the business
  Inexplicable Rules: have no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule
POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products

                Coke  Window cleaner  Milk  Soda  Detergent
Coke               4               1     1     2          2
Window cleaner     1               2     1     1          0
Milk               1               1     1     0          0
Soda               2               1     0     3          1
Detergent          2               0     0     1          2
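The co-occurrence counts on this slide can be reproduced mechanically from the five POS transactions. The following is a minimal sketch, not the method used to prepare the slides; the function name `co_occurrence` and the choice of a `Counter` keyed by item pairs are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# The five point-of-sale transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def co_occurrence(baskets):
    """Count, for every item pair (a, b), the baskets that contain both.

    The diagonal entry (a, a) is the number of baskets that contain a.
    """
    counts = Counter()
    for basket in baskets:
        for item in basket:
            counts[(item, item)] += 1
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # keep the matrix symmetric
    return counts

matrix = co_occurrence(transactions)
print(matrix[("Coke", "Coke")])       # 4
print(matrix[("Coke", "soda")])       # 2
print(matrix[("milk", "detergent")])  # 0 (a Counter returns 0 for unseen pairs)
```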
46
How Good is an Association Rule
                Coke  Window cleaner  Milk  Soda  Detergent
Coke               4               1     1     2          2
Window cleaner     1               2     1     1          0
Milk               1               1     1     0          0
Soda               2               1     0     3          1
Detergent          2               0     0     1          2

Simple patterns
1. Coke and soda are purchased together more often than most other pairs (2 of the 5 transactions, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this rule?
  If a customer purchases soda, then the customer also purchases Coke
  2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
  2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

POS Transactions

Customer  Items Purchased
1         Coke, soda
2         Milk, Coke, window cleaner
3         Coke, detergent
4         Coke, detergent, soda
5         Window cleaner, soda
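The confidence calculation above can be checked with a few lines of code. This is a small sketch under the slide's definition; the function name `confidence` and its argument names are illustrative choices, not part of the original material.

```python
# The five point-of-sale transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def confidence(baskets, if_items, then_items):
    """Transactions with all the items divided by transactions with the 'if' items."""
    if_items, then_items = set(if_items), set(then_items)
    with_if = [b for b in baskets if if_items <= b]
    if not with_if:
        raise ValueError("no transaction contains the 'if' items")
    with_all = [b for b in with_if if then_items <= b]
    return len(with_all) / len(with_if)

print(round(confidence(transactions, {"soda"}, {"Coke"}), 2))  # 0.67
print(confidence(transactions, {"Coke"}, {"soda"}))            # 0.5
```

Swapping the "if" and "then" items changes the denominator, which is why the two directions of the same pair give different confidences.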
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the number of records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift: when lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
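As an illustration of the lift definition, the sketch below computes lift on the five POS transactions from the earlier slides, using the usual formulation support(if and then) / (support(if) × support(then)). The helper names `support` and `lift` are assumptions for this example. On this data, "if soda then Coke" has a lift below 1 even though its confidence is 67%, because Coke already appears in 4 of the 5 transactions.

```python
# The five point-of-sale transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def support(baskets, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b) / len(baskets)

def lift(baskets, if_items, then_items):
    """Observed joint support divided by the support expected under independence."""
    joint = support(baskets, set(if_items) | set(then_items))
    expected = support(baskets, if_items) * support(baskets, then_items)
    return joint / expected

print(round(lift(transactions, {"soda"}, {"Coke"}), 2))       # 0.83, worse than guessing
print(round(lift(transactions, {"Coke"}, {"detergent"}), 2))  # 1.25, better than guessing
```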
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items: "if Coke then soda"
2. Generate co-occurrence matrix for two items: "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items: "if Coke and Milk and Window Cleaner then soda"
4. Etc.
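The level-by-level procedure in this list can be sketched as a brute-force enumeration: count every itemset of a given size, then turn each larger itemset into "if ... then ..." rules. The function names `itemset_counts` and `rules` are hypothetical, and the sketch applies no pruning, so it only illustrates the idea on the five-transaction example.

```python
from collections import Counter
from itertools import combinations

# The five point-of-sale transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

def itemset_counts(baskets, size):
    """Count how many baskets contain each itemset of the given size."""
    counts = Counter()
    for basket in baskets:
        for itemset in combinations(sorted(basket), size):
            counts[itemset] += 1
    return counts

def rules(baskets, if_size):
    """Yield (if_items, then_item, confidence) for rules with `if_size` 'if' items."""
    if_counts = itemset_counts(baskets, if_size)
    full_counts = itemset_counts(baskets, if_size + 1)
    for itemset, n_both in full_counts.items():
        for then_item in itemset:
            if_items = tuple(i for i in itemset if i != then_item)
            yield if_items, then_item, n_both / if_counts[if_items]

# Level 1: single-item "if" part; level 2: two-item "if" part.
for size in (1, 2):
    for if_items, then_item, conf in rules(transactions, size):
        print(f"if {' and '.join(if_items)} then {then_item}: {conf:.2f}")
```

Each extra level multiplies the number of itemsets to count, which is the practical limit this slide refers to.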
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant: could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700
Supermarket: 10,000 or more unique items
  About 50 million 2-item combinations
  More than 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
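The combination counts quoted on this slide follow from the binomial coefficient "n choose k". A quick check, assuming Python 3.8 or later for `math.comb`:

```python
from math import comb

# Fast food menu: 100 items, combinations of 3 different items.
print(comb(100, 3))     # 161700

# Supermarket: 10,000 unique items.
print(comb(10_000, 2))  # 49995000 (about 50 million pairs)
print(comb(10_000, 3))  # 166616670000 (about 167 billion triples)
```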
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
The banking case seems to provide well-defined and intelligible information of the form account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time
As such, the rules found provide a guide to action: offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
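Filtering and aggregation can be illustrated on the five POS transactions from the earlier slides. In this sketch the category names in `category_of` are a hypothetical product hierarchy invented for the example, and `aggregate` and `filter_rare` are illustrative helper names rather than functions of any particular tool.

```python
from collections import Counter

# The five point-of-sale transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

# Hypothetical product hierarchy (an assumption for this example only).
category_of = {
    "Coke": "beverages",
    "soda": "beverages",
    "milk": "dairy",
    "window cleaner": "cleaning",
    "detergent": "cleaning",
}

def aggregate(baskets, hierarchy):
    """Replace each item by its group, shrinking the number of unique items."""
    return [{hierarchy[item] for item in basket} for basket in baskets]

def filter_rare(baskets, min_count):
    """Drop items that appear in fewer than `min_count` baskets."""
    counts = Counter(item for basket in baskets for item in basket)
    return [{i for i in basket if counts[i] >= min_count} for basket in baskets]

grouped = aggregate(transactions, category_of)
print(len(set().union(*transactions)), "->", len(set().union(*grouped)))  # 5 -> 3

filtered = filter_rare(transactions, min_count=2)
print(sorted(filtered[1]))  # ['Coke', 'window cleaner'] (milk appears only once)
```

Fewer unique items means far fewer candidate combinations to count, at the cost of coarser rules.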
64
Thank You
43
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner
44
Association Rules
Association rule types
Actionable Rules – contain high-quality actionable information
Trivial Rules – information already well known by those familiar with the business
Inexplicable Rules – have no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
45
How Good is an Association Rule?
POS Transactions
Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda
Co-occurrence of Products
                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2
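The co-occurrence counts can be reproduced from the five transactions with a short script. This is only a minimal sketch: the names `transactions`, `items` and `co_occurrence` are illustrative assumptions, not something the slides define.

```python
# The five point-of-sale transactions from the slide, one set per customer.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

# Column order used in the slide's co-occurrence table.
items = ["Coke", "window cleaner", "milk", "soda", "detergent"]


def co_occurrence(transactions, items):
    """Count, for every ordered pair (a, b), the transactions holding both.

    The diagonal entry (a, a) is simply the number of transactions with a.
    """
    counts = {(a, b): 0 for a in items for b in items}
    for basket in transactions:
        for a in items:
            for b in items:
                if a in basket and b in basket:
                    counts[(a, b)] += 1
    return counts


matrix = co_occurrence(transactions, items)
for a in items:
    print(f"{a:15s}", [matrix[(a, b)] for b in items])
```

Each printed row corresponds to one row of the table above; the matrix is symmetric because "a with b" and "b with a" describe the same baskets.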
46
How Good is an Association Rule?
                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2
Simple patterns
1. Coke and soda are purchased together more often than most other pairs (2 of 5 transactions, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
47
How Good is an Association Rule?
What is the confidence for this rule: "If a customer purchases soda, then the customer also purchases Coke"?
2 out of 3 soda purchases also include Coke, so 67%
What about the confidence of this rule reversed?
2 out of 4 Coke purchases also include soda, so 50%
Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items
POS Transactions
Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda
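The two confidence figures above can be checked directly from the transaction list. The function name `confidence` and its signature are assumptions made for this sketch; exact fractions are used so the 2/3 and 2/4 ratios are visible without rounding.

```python
from fractions import Fraction

# The five point-of-sale transactions from the slide.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def confidence(transactions, if_items, then_items):
    """Transactions with all items divided by transactions with the "if" items.

    Raises ZeroDivisionError when no transaction contains the "if" items.
    """
    if_items, then_items = set(if_items), set(then_items)
    with_if = [t for t in transactions if if_items <= t]
    with_all = [t for t in with_if if then_items <= t]
    return Fraction(len(with_all), len(with_if))


soda_to_coke = confidence(transactions, {"soda"}, {"Coke"})  # if soda then Coke
coke_to_soda = confidence(transactions, {"Coke"}, {"soda"})  # the reversed rule
print(float(soda_to_coke), float(coke_to_soda))
```

The two directions differ because the denominators differ: three baskets contain soda, while four contain Coke.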
48
How Good is an Association Rule?
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products
Calculating lift… When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
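As a worked illustration of lift, the sketch below applies the definition to the five-transaction example used earlier in the deck. The helper names `support` and `lift` are assumptions for this example, and the negative rule is read here as "if soda then NOT Coke".

```python
from fractions import Fraction

# The five point-of-sale transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= t)
    return Fraction(hits, len(transactions))


def lift(transactions, if_items, then_items):
    """Observed joint support divided by the support expected if the
    "if" and "then" items were unrelated (product of their supports)."""
    both = support(transactions, set(if_items) | set(then_items))
    expected = support(transactions, if_items) * support(transactions, then_items)
    return both / expected


# Rule: if soda then Coke.
lift_soda_coke = lift(transactions, {"soda"}, {"Coke"})

# Negative rule: if soda then NOT Coke, computed from the same baskets.
n = len(transactions)
soda_without_coke = Fraction(
    sum(1 for t in transactions if "soda" in t and "Coke" not in t), n
)
without_coke = Fraction(sum(1 for t in transactions if "Coke" not in t), n)
lift_negative = soda_without_coke / (support(transactions, {"soda"}) * without_coke)

print(lift_soda_coke, lift_negative)
```

On this tiny data set the rule "if soda then Coke" has lift 5/6, below 1, while the negative rule has lift 5/3, which matches the slide's remark that the negative rule can outperform guessing when the original rule does not.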
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items… "if Coke then soda"
2. Generate co-occurrence matrix for two items… "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items… "if Coke and Milk and Window Cleaner then soda"
4. Etc…
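The step-by-step counting above can be sketched as a brute-force count of item combinations of increasing size. This is an illustration only: the slides do not name a specific algorithm, and `count_itemsets` is a hypothetical helper that enumerates every combination without any pruning.

```python
from collections import Counter
from itertools import combinations

# The five point-of-sale transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def count_itemsets(transactions, size):
    """Count how many transactions contain each combination of `size` items."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives every combination one canonical tuple form.
        for combo in combinations(sorted(basket), size):
            counts[combo] += 1
    return counts


for size in (1, 2, 3):
    counts = count_itemsets(transactions, size)
    print(size, "item(s):", len(counts), "distinct combinations observed")
```

Only combinations that actually occur in some basket are counted, which keeps the tables small here; with thousands of unique items the number of possible combinations grows very quickly.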
51
Final Thought on Association Rules: The Problem of Lots of Data
Fast Food Restaurant… could have 100 items on its menu
How many combinations are there with 3 different menu items? 161,700
Supermarket… 10,000 or more unique items
50 million 2-item combinations
over 100 billion 3-item combinations
Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)
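The combination counts quoted above follow from the binomial coefficient "n choose k". A quick check, using Python's standard `math.comb` and taking exactly 100 menu items and exactly 10,000 supermarket items as the assumed sizes:

```python
from math import comb

# Number of ways to choose k distinct items out of n, ignoring order.
menu_triples = comb(100, 3)            # 3-item combinations on a 100-item menu
supermarket_pairs = comb(10_000, 2)    # 2-item combinations of 10,000 items
supermarket_triples = comb(10_000, 3)  # 3-item combinations of 10,000 items

print(f"{menu_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```

With exactly 10,000 items the pair count is just under 50 million and the triple count is roughly 167 billion, so the slide's figures are order-of-magnitude estimates.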
52
Business and other cases
60
General Observations
Banking case seems to provide well-defined and intelligible information of the form account_1 and account_2 etc., or activity_1 and activity_2 etc., possibly indexed by time
As such, the rules found provide a guide to action: to offer a product or service (cross-sell)
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications
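One way to picture aggregation is to roll individual products up into broader groups before counting. The sketch below assumes a made-up product hierarchy (`category_of`); the category names are hypothetical and do not come from the slides.

```python
# The five point-of-sale transactions from the earlier slides.
transactions = [
    {"Coke", "soda"},
    {"milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]

# Hypothetical product hierarchy: each item maps to one broader category.
category_of = {
    "Coke": "beverages",
    "soda": "beverages",
    "milk": "dairy",
    "window cleaner": "cleaning",
    "detergent": "cleaning",
}


def aggregate(transactions, category_of):
    """Replace every item in every basket with its category."""
    return [{category_of[item] for item in basket} for basket in transactions]


aggregated = aggregate(transactions, category_of)
categories = set().union(*aggregated)

print(aggregated)
print(len(category_of), "items reduced to", len(categories), "categories")
```

Fewer distinct items means far fewer possible combinations to count, at the cost of rules that speak about categories rather than individual products.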
64
Thank You
44
Association Rules
Association rule typesAssociation rule types Actionable Rules ndash contain high-Actionable Rules ndash contain high-
quality actionable informationquality actionable information Trivial Rules ndash information already Trivial Rules ndash information already
well-known by those familiar with well-known by those familiar with the businessthe business
Inexplicable Rules ndash no explanation Inexplicable Rules ndash no explanation and do not suggest actionand do not suggest action
Trivial and Inexplicable Rules Trivial and Inexplicable Rules occur most oftenoccur most often
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association RulesThe Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Finally know that the number of Finally know that the number of transactions in a given time-period could transactions in a given time-period could also be huge (hence expensive to analyze)also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items In retailing case of items purchased together guidance is purchased together guidance is not so clear cut due to extensive not so clear cut due to extensive number of rulesnumber of rules
62
Challenges
A major difficulty is that a large number of A major difficulty is that a large number of the rules found may be trivial for anyone the rules found may be trivial for anyone familiar with the business familiar with the business
The computational complexity involved in The computational complexity involved in calculating the results of market basket calculating the results of market basket analysis is at least the square of the number analysis is at least the square of the number of transaction item-lines (records of every of transaction item-lines (records of every item purchased) With data warehouses item purchased) With data warehouses storing billions of transaction lines this storing billions of transaction lines this yields extremely high computational yields extremely high computational requirements requirements
63
Solutions
Differential market basket analysisDifferential market basket analysis can find interesting results and can also can find interesting results and can also eliminate the problem of a potentially eliminate the problem of a potentially high volume of trivial resultshigh volume of trivial results
Special techniques involving Special techniques involving filtering filtering or aggregationor aggregation of the transaction of the transaction database are commonly used to in database are commonly used to in analysis algorithms to increase analysis algorithms to increase performance and allow some level of performance and allow some level of interactivity such as in business interactivity such as in business intelligence applicationsintelligence applications
64
Thank You
45
How Good is an Association Rule
CustomerCustomer Items PurchasedItems Purchased
11 Coke sodaCoke soda
22 Milk Coke window cleanerMilk Coke window cleaner
33 Coke detergentCoke detergent
44 Coke detergent sodaCoke detergent soda
55 Window cleaner sodaWindow cleaner soda
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
CokeCoke 44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
POS Transactions
Co-occurrence ofProducts
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association RulesThe Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Finally know that the number of Finally know that the number of transactions in a given time-period could transactions in a given time-period could also be huge (hence expensive to analyze)also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items In retailing case of items purchased together guidance is purchased together guidance is not so clear cut due to extensive not so clear cut due to extensive number of rulesnumber of rules
62
Challenges
A major difficulty is that a large number of A major difficulty is that a large number of the rules found may be trivial for anyone the rules found may be trivial for anyone familiar with the business familiar with the business
The computational complexity involved in The computational complexity involved in calculating the results of market basket calculating the results of market basket analysis is at least the square of the number analysis is at least the square of the number of transaction item-lines (records of every of transaction item-lines (records of every item purchased) With data warehouses item purchased) With data warehouses storing billions of transaction lines this storing billions of transaction lines this yields extremely high computational yields extremely high computational requirements requirements
63
Solutions
Differential market basket analysisDifferential market basket analysis can find interesting results and can also can find interesting results and can also eliminate the problem of a potentially eliminate the problem of a potentially high volume of trivial resultshigh volume of trivial results
Special techniques involving Special techniques involving filtering filtering or aggregationor aggregation of the transaction of the transaction database are commonly used to in database are commonly used to in analysis algorithms to increase analysis algorithms to increase performance and allow some level of performance and allow some level of interactivity such as in business interactivity such as in business intelligence applicationsintelligence applications
64
Thank You
46
How Good is an Association Rule
CokCokee
Window Window cleanercleaner
MilkMilk SodaSoda DetergentDetergent
44 11 11 22 22
Window cleanerWindow cleaner 11 22 11 11 00
MilkMilk 11 11 11 00 00
SodaSoda 22 11 00 33 11
DetergentDetergent 22 00 00 11 22
Simple patterns1 Coke and soda are more likely purchased together thanany other two items2 Detergent is never purchased with milk or window cleaner3 Milk is never purchased with soda or detergent
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a ruleHow much better than chance is a rule Lift (improvement) tells us how much better a rule is at Lift (improvement) tells us how much better a rule is at
predicting the result than just assuming the result in the predicting the result than just assuming the result in the first placefirst place
Lift Lift is the ratio of the records that support the entire rule to is the ratio of the records that support the entire rule to the number that would be expected assuming there was no the number that would be expected assuming there was no relationship between the productsrelationship between the products
Calculating lifthellipWhen lift gt 1 then the rule is better at Calculating lifthellipWhen lift gt 1 then the rule is better at predicting the result than guessingpredicting the result than guessing
When lift lt 1 the rule is doing worse than informed When lift lt 1 the rule is doing worse than informed guessing and using the guessing and using the Negative RuleNegative Rule produces a better produces a better rule than guessingrule than guessing
49
Creating Association Rules
11 Choosing the right set Choosing the right set of itemsof items
22 Generating rules by Generating rules by deciphering the deciphering the counts in the co-counts in the co-occurrence matrixoccurrence matrix
33 Overcoming the Overcoming the practical limits practical limits imposed by thousands imposed by thousands or tens of thousands or tens of thousands of unique itemsof unique items
50
Overcoming Practical Limits for Association Rules
11 Generate co-occurrence matrix Generate co-occurrence matrix for single itemshelliprdquofor single itemshelliprdquoif Coke then if Coke then sodardquosodardquo
22 Generate co-occurrence matrix Generate co-occurrence matrix for two itemshelliprdquofor two itemshelliprdquoif Coke and Milk if Coke and Milk then sodardquothen sodardquo
33 Generate co-occurrence matrix Generate co-occurrence matrix for three itemshelliprdquofor three itemshelliprdquoif Coke and Milk if Coke and Milk and Windowand Window Cleanerrdquo then soda Cleanerrdquo then soda
44 EtchellipEtchellip
51
Final Thought on Association RulesThe Problem of Lots of Data
Fast Food Restauranthellipcould have 100 Fast Food Restauranthellipcould have 100 items on its menuitems on its menu How many combinations are there with 3 How many combinations are there with 3
different menu items 161700 different menu items 161700 Supermarkethellip10000 or more unique Supermarkethellip10000 or more unique
itemsitems 50 million 2-item combinations50 million 2-item combinations 100 billion 3-item combinations100 billion 3-item combinations
Use of product hierarchies (groupings) Use of product hierarchies (groupings) helps address this common issuehelps address this common issue
Finally know that the number of Finally know that the number of transactions in a given time-period could transactions in a given time-period could also be huge (hence expensive to analyze)also be huge (hence expensive to analyze)
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items In retailing case of items purchased together guidance is purchased together guidance is not so clear cut due to extensive not so clear cut due to extensive number of rulesnumber of rules
62
Challenges
A major difficulty is that a large number of A major difficulty is that a large number of the rules found may be trivial for anyone the rules found may be trivial for anyone familiar with the business familiar with the business
The computational complexity involved in The computational complexity involved in calculating the results of market basket calculating the results of market basket analysis is at least the square of the number analysis is at least the square of the number of transaction item-lines (records of every of transaction item-lines (records of every item purchased) With data warehouses item purchased) With data warehouses storing billions of transaction lines this storing billions of transaction lines this yields extremely high computational yields extremely high computational requirements requirements
63
Solutions
Differential market basket analysisDifferential market basket analysis can find interesting results and can also can find interesting results and can also eliminate the problem of a potentially eliminate the problem of a potentially high volume of trivial resultshigh volume of trivial results
Special techniques involving Special techniques involving filtering filtering or aggregationor aggregation of the transaction of the transaction database are commonly used to in database are commonly used to in analysis algorithms to increase analysis algorithms to increase performance and allow some level of performance and allow some level of interactivity such as in business interactivity such as in business intelligence applicationsintelligence applications
64
Thank You
47
How Good is an Association Rule
What is the confidence for this ruleWhat is the confidence for this rule If a customer purchases soda then customer also purchases CokeIf a customer purchases soda then customer also purchases Coke 2 out of 3 soda purchases also include Coke so 672 out of 3 soda purchases also include Coke so 67
What about the confidence of this rule reversedWhat about the confidence of this rule reversed 2 out of 4 Coke purchases also include soda so 502 out of 4 Coke purchases also include soda so 50
Confidence Confidence = Ratio of the number of transactions with all the = Ratio of the number of transactions with all the items to the number of transactions with just the ldquoifrdquo itemsitems to the number of transactions with just the ldquoifrdquo items
Customer Items Purchased
1 Coke soda
2 Milk Coke window cleaner
3 Coke detergent
4 Coke detergent soda
5 Window cleaner soda
POS Transactions
48
How Good is an Association Rule
How much better than chance is a rule?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place.
Lift is the ratio of the records that support the entire rule to the number that would be expected assuming there was no relationship between the products.
Calculating lift: when lift > 1, the rule is better at predicting the result than guessing.
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing.
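As a rough illustration of the lift definition, the sketch below applies it to the five POS transactions from the confidence slide. The helper names (support, lift, negative_rule_lift) are assumptions made for this example, and the negative rule is read here as "if X then NOT Y".

```python
# The five POS transactions from the confidence slide.
transactions = [
    {"Coke", "soda"},
    {"Milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def support(items, transactions):
    """Fraction of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t) / len(transactions)


def lift(if_items, then_items, transactions):
    """Observed support of the whole rule over the support expected if the
    two sides were unrelated (independent)."""
    expected = support(if_items, transactions) * support(then_items, transactions)
    return support(if_items | then_items, transactions) / expected


def negative_rule_lift(if_items, then_item, transactions):
    """Lift of the negative rule 'if X then NOT Y' (interpretation assumed here)."""
    n = len(transactions)
    p_if = support(if_items, transactions)
    p_not_then = 1 - support({then_item}, transactions)
    p_both = sum(1 for t in transactions if if_items <= t and then_item not in t) / n
    return p_both / (p_if * p_not_then)


rule_lift = lift({"soda"}, {"Coke"}, transactions)
neg_lift = negative_rule_lift({"soda"}, "Coke", transactions)
print(f"lift(soda -> Coke)     = {rule_lift:.2f}")
print(f"lift(soda -> not Coke) = {neg_lift:.2f}")
```

In this small data set the rule "if soda then Coke" has lift below 1 (0.4 observed versus 0.6 x 0.8 = 0.48 expected), while the negative rule has lift above 1, which matches the slide's remark about negative rules.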
49
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
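Step 2 relies on a co-occurrence matrix. A minimal sketch of how such a matrix might be counted from the POS transactions shown earlier follows; the dictionary-of-pairs layout and the name co_occurrence are choices made for this example only.

```python
from collections import Counter
from itertools import combinations

# The five POS transactions from the confidence slide.
transactions = [
    {"Coke", "soda"},
    {"Milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def co_occurrence(transactions):
    """Count, for every pair of items, the baskets that contain both.

    The diagonal entry (item, item) holds the number of baskets with that item.
    """
    counts = Counter()
    for basket in transactions:
        for item in basket:
            counts[(item, item)] += 1
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1  # the matrix is symmetric,
            counts[(b, a)] += 1  # so store both orientations
    return counts


matrix = co_occurrence(transactions)
items = sorted({item for basket in transactions for item in basket})
for row in items:
    print(row.ljust(15), [matrix[(row, col)] for col in items])
```

Reading a rule off the matrix: the confidence of "if soda then Coke" is the (soda, Coke) cell divided by the (soda, soda) diagonal cell, that is 2 / 3.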
50
Overcoming Practical Limits for Association Rules
1. Generate co-occurrence matrix for single items... "if Coke then soda"
2. Generate co-occurrence matrix for two items... "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items... "if Coke and Milk and Window Cleaner then soda"
4. Etc.
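The level-by-level idea in these steps can be sketched as a brute-force loop over "if" sides of growing size. This is only an illustration on the five POS transactions from earlier, not an efficient algorithm, and the function name rules_for_antecedent_size is hypothetical.

```python
from itertools import combinations

# The five POS transactions from the confidence slide.
transactions = [
    {"Coke", "soda"},
    {"Milk", "Coke", "window cleaner"},
    {"Coke", "detergent"},
    {"Coke", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def rules_for_antecedent_size(transactions, k):
    """Return (if_items, then_item, confidence) for every rule with k "if" items
    that is supported by at least one transaction."""
    items = sorted({item for basket in transactions for item in basket})
    rules = []
    for if_items in combinations(items, k):
        if_set = set(if_items)
        n_if = sum(1 for t in transactions if if_set <= t)
        if n_if == 0:
            continue  # this "if" side never occurs, so no rule can be formed
        for then_item in items:
            if then_item in if_set:
                continue
            n_all = sum(1 for t in transactions if if_set <= t and then_item in t)
            if n_all:
                rules.append((if_items, then_item, n_all / n_if))
    return rules


rules_1 = rules_for_antecedent_size(transactions, 1)  # "if Coke then soda"
rules_2 = rules_for_antecedent_size(transactions, 2)  # "if Coke and detergent then soda"
rules_3 = rules_for_antecedent_size(transactions, 3)  # needs baskets of 4+ items
for level, rules in enumerate((rules_1, rules_2, rules_3), start=1):
    print(level, len(rules))
```

Because no basket in the example holds more than three items, the third level produces no rules here; on real data each level multiplies the number of candidate "if" sides, which is the practical limit the slide refers to.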
51
Final Thought on Association Rules: The Problem of Lots of Data
A fast food restaurant could have 100 items on its menu.
How many combinations are there with 3 different menu items? 161,700.
A supermarket could have 10,000 or more unique items:
about 50 million 2-item combinations
over 100 billion 3-item combinations (about 167 billion for exactly 10,000 items)
Use of product hierarchies (groupings) helps address this common issue.
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze).
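The combination counts on this slide follow from the binomial coefficient "n choose k". The short check below uses Python's standard math.comb; the figure of 200 product categories is a made-up number used only to show how grouping items shrinks the search space.

```python
from math import comb

# Fast food menu: 100 items, combinations of 3 different items.
menu_triples = comb(100, 3)
print(menu_triples)          # 161700

# Supermarket with exactly 10,000 unique items.
supermarket_pairs = comb(10_000, 2)
supermarket_triples = comb(10_000, 3)
print(supermarket_pairs)     # 49995000      (about 50 million)
print(supermarket_triples)   # 166616670000  (about 167 billion)

# Hypothetical product hierarchy: the same items rolled up into 200 categories.
n_categories = 200           # assumed value, not from the slides
category_pairs = comb(n_categories, 2)
category_triples = comb(n_categories, 3)
print(category_pairs)        # 19900
print(category_triples)      # 1313400
```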
52
Business and other cases
53
54
55
56
57
58
59
60
General Observations
The banking case seems to provide well-defined and intelligible information of the form account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time.
As such, the rules found provide a guide to action: offer a product or service (cross-sell).
61
In the retailing case of items purchased together, guidance is not so clear-cut due to the extensive number of rules.
62
Challenges
A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business.
The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements.
63
Solutions
Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results.
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications.
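One common reading of differential market basket analysis is to compare rules between two groups (for example two stores or two time periods) and keep only the rules whose strength differs. The sketch below follows that reading; the baskets for store_a and store_b, the min_gap threshold and the function names are all invented for illustration and do not come from the slides.

```python
# Hypothetical baskets for two stores (illustrative data only).
store_a = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk"},
]
store_b = [
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "milk"},
    {"milk", "butter"},
]


def confidence(if_item, then_item, transactions):
    """Confidence of 'if if_item then then_item'; 0.0 when if_item never occurs."""
    n_if = sum(1 for t in transactions if if_item in t)
    n_all = sum(1 for t in transactions if if_item in t and then_item in t)
    return n_all / n_if if n_if else 0.0


def differential_rules(group_a, group_b, min_gap=0.3):
    """Return (if_item, then_item, gap) for rules whose confidence differs
    between the two groups by at least min_gap."""
    items = sorted({item for t in group_a + group_b for item in t})
    found = []
    for x in items:
        for y in items:
            if x == y:
                continue
            gap = confidence(x, y, group_a) - confidence(x, y, group_b)
            if abs(gap) >= min_gap:
                found.append((x, y, gap))
    return found


differences = differential_rules(store_a, store_b)
for if_item, then_item, gap in differences:
    print(f"if {if_item} then {then_item}: confidence gap {gap:+.2f}")
```

Rules that hold equally in both groups, which are often the ones already obvious to the business, drop out of the result, while rules such as "if bread then butter" that behave differently between the two stores remain.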
64
Thank You
56
57
58
59
60
General Observations
Banking case seems to provide Banking case seems to provide well defined and intelligible well defined and intelligible information of the forminformation of the form account_1 and account_2 etc or account_1 and account_2 etc or
activity_1 and activity_2 etc activity_1 and activity_2 etc possibly indexed by timepossibly indexed by time
As such rules found provide guide As such rules found provide guide to action to offer product or service to action to offer product or service (cross-sell)(cross-sell)
61
In retailing case of items purchased together, guidance is not so clear cut due to extensive number of rules