association rules (market basket analysis)
![Page 1: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/1.jpg)
Association Rules (market basket analysis)
Retail shops are often interested in associations between different items that people buy.
• Someone who buys bread is quite likely also to buy milk.
• A person who bought the book Database System Concepts is quite likely also to buy the book Operating System Concepts.
Association information can be used in several ways.
• E.g. when a customer buys a particular book, an online shop may suggest associated books.
Association rules:
bread ⇒ milk
DB-Concepts, OS-Concepts ⇒ Networks
• Left-hand side: antecedent; right-hand side: consequent
• An association rule must have an associated population; the population consists of a set of instances
• E.g. each transaction (sale) at a shop is an instance, and the set of all transactions is the population
![Page 2: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/2.jpg)
Association Rule Definitions
Set of items: I = {I1, I2, …, Im}
Transactions: D = {t1, t2, …, tn}, tj ⊆ I
Itemset: {Ii1, Ii2, …, Iik} ⊆ I
Support of an itemset: percentage of transactions that contain that itemset.
Large (frequent) itemset: itemset whose number of occurrences is above a threshold.
![Page 3: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/3.jpg)
Association Rules Example
I = { Beer, Bread, Jelly, Milk, PeanutButter}
![Page 4: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/4.jpg)
Association Rule Definitions
Association rule (AR): implication X ⇒ Y, where X, Y ⊆ I and X ∩ Y = ∅ (the null set)
Support of AR (s) X ⇒ Y: percentage of transactions that contain X ∪ Y
Confidence of AR (α) X ⇒ Y: ratio of the number of transactions that contain X ∪ Y to the number that contain X
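Written as formulas, the two measures look as follows. This is only a restatement of the definitions above; the symbol α for confidence is assumed here, and |·| denotes the number of transactions in a set.

```latex
\[
s(X \Rightarrow Y) = \frac{|\{\, t \in D : X \cup Y \subseteq t \,\}|}{|D|},
\qquad
\alpha(X \Rightarrow Y) = \frac{|\{\, t \in D : X \cup Y \subseteq t \,\}|}{|\{\, t \in D : X \subseteq t \,\}|}
\]
```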
![Page 5: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/5.jpg)
Association Rules Ex (cont’d)
![Page 6: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/6.jpg)
Association Rules Ex (cont’d)
Support of Bread ⇒ PeanutButter: of 5 transactions, 3 involve both Bread and PeanutButter, so support = 3/5 = 60%.
Confidence of Bread ⇒ PeanutButter: of the 4 transactions that involve Bread, 3 also involve PeanutButter, so confidence = 3/4 = 75%.
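The two percentages can be checked with a few lines of Python. The transaction table itself is not reproduced in the text, so the five baskets below are hypothetical, chosen only to be consistent with the item set I and the counts quoted above; the function names are illustrative as well.

```python
# Hypothetical transactions: NOT taken from the slides, only chosen so that
# 4 baskets contain Bread and 3 contain both Bread and PeanutButter.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count_containing(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Fraction of transactions containing the antecedent that also contain the consequent."""
    both = count_containing(set(antecedent) | set(consequent), transactions)
    return both / count_containing(antecedent, transactions)


rule_support = support({"Bread", "PeanutButter"}, transactions)
rule_confidence = confidence({"Bread"}, {"PeanutButter"}, transactions)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```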
![Page 7: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/7.jpg)
Association Rule Problem
Given a set of items I = {I1, I2, …, Im} and a database of transactions D = {t1, t2, …, tn}, where ti = {Ii1, Ii2, …, Iik} and Iij ∈ I, the Association Rule Problem is to identify all association rules X ⇒ Y with a minimum support and confidence (supplied by the user).
NOTE: Support of X ⇒ Y is the same as support of X ∪ Y.
![Page 8: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/8.jpg)
Association Rule Algorithm (Basic Idea)
1. Find large itemsets.
2. Generate rules from frequent itemsets.
This is the simple naïve algorithm; better algorithms exist.
![Page 9: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/9.jpg)
Association Rule Algorithm
We are generally only interested in association rules with reasonably high support (e.g. support of 2% or greater).
Naïve algorithm
1. Consider all possible sets of relevant items.
2. For each set find its support (i.e. count how many transactions contain all items in the set).
• Large itemsets: sets with sufficiently high support
• Use large itemsets to generate association rules.
• From itemset A generate the rule A - {b} ⇒ b for each b ∈ A.
• Support of rule = support(A).
• Confidence of rule = support(A) / support(A - {b})
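The steps above can be sketched in Python as follows. This is a minimal brute-force illustration, not an efficient implementation: it enumerates every non-empty itemset, so it is exponential in the number of items. The function name, the thresholds, and the sample transactions are all assumptions made for the example.

```python
from itertools import combinations


def naive_association_rules(transactions, min_support, min_confidence):
    """Enumerate all itemsets, keep the large ones, emit rules A - {b} => b."""
    items = sorted(set().union(*transactions))
    n = len(transactions)

    # Steps 1 and 2: support of every non-empty itemset.
    support = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            support[itemset] = sum(1 for t in transactions if itemset <= t) / n

    # Large itemsets: those with sufficiently high support.
    large = {a: s for a, s in support.items() if s >= min_support}

    rules = []
    for a, s in large.items():
        if len(a) < 2:
            continue  # a rule needs a non-empty antecedent
        for b in sorted(a):
            antecedent = a - {b}
            if support[antecedent] == 0:
                continue  # only possible when min_support is 0
            conf = s / support[antecedent]  # support(A) / support(A - {b})
            if conf >= min_confidence:
                rules.append((antecedent, b, s, conf))
    return rules


# Hypothetical sample data, for illustration only.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]
rules = naive_association_rules(transactions, min_support=0.5, min_confidence=0.7)
for antecedent, b, s, conf in rules:
    print(sorted(antecedent), "=>", b, f"support={s:.2f}", f"confidence={conf:.2f}")
```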
![Page 10: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/10.jpg)
• From itemset A generate the rule A - {b} ⇒ b for each b ∈ A.
• Support of rule = support(A).
• Confidence of rule = support(A) / support(A - {b})
Let's say itemset A = {Bread, Butter, Milk}
Then A - {b} ⇒ b for each b ∈ A includes 3 possibilities:
{Bread, Butter} ⇒ Milk
{Bread, Milk} ⇒ Butter
{Butter, Milk} ⇒ Bread
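The same enumeration can be written as a short Python generator. The function name is illustrative, and the rules come out in alphabetical order of the consequent rather than in the order listed above.

```python
def rules_from_itemset(itemset):
    """Yield (antecedent, consequent) pairs A - {b} => b for each b in A."""
    a = frozenset(itemset)
    for b in sorted(a):
        yield a - {b}, b


candidate_rules = list(rules_from_itemset({"Bread", "Butter", "Milk"}))
for antecedent, consequent in candidate_rules:
    print(f"{{{', '.join(sorted(antecedent))}}} => {consequent}")
```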
![Page 11: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/11.jpg)
Apriori
Large Itemset Property:
Any subset of a large itemset is large.
Contrapositive:
If an itemset is not large,
none of its supersets are large.
![Page 12: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/12.jpg)
Large Itemset Property
![Page 13: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/13.jpg)
Large Itemset Property
If B is not frequent, then none of the supersets of B can be frequent.
If {ACD} is frequent, then all subsets of {ACD} ({AC}, {AD}, {CD}) must be frequent.
If {ACD} is frequent, then all subsets of {ACD} ({A}, {C}, {D}) must be frequent.
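One way the property is typically used is to discard candidate itemsets before counting them: a size-k candidate can only be large if every one of its (k-1)-subsets is already known to be large. The sketch below shows just that pruning step, not a full Apriori implementation; the function name and the sample itemsets are assumptions made for illustration.

```python
from itertools import combinations


def prune_candidates(candidates, large_smaller):
    """Keep a size-k candidate only if all of its (k-1)-subsets are large."""
    kept = []
    for c in candidates:
        c = frozenset(c)
        subsets = (frozenset(sub) for sub in combinations(c, len(c) - 1))
        if all(sub in large_smaller for sub in subsets):
            kept.append(c)
    return kept


# Hypothetical large 2-itemsets: {A,C}, {A,D}, {C,D}. Nothing containing B is large.
large_2 = {frozenset("AC"), frozenset("AD"), frozenset("CD")}
candidates_3 = [frozenset("ACD"), frozenset("ABC")]
survivors = prune_candidates(candidates_3, large_2)
print([sorted(s) for s in survivors])
```

Here {A, B, C} is dropped without any pass over the transactions, because its subsets {A, B} and {B, C} are not large.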
![Page 14: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/14.jpg)
My Personal View of Association Rules
Vastly overstudied problem, of dubious utility
![Page 15: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/15.jpg)
Student Presentations
Starting next week, students will be giving presentations.
A presentation can be on:
• The student project
• A paper chosen by the student (subject to my approval)
The presentation should last 8 to 15 minutes. You need to tell me in advance how long the talk will be.
You must email me the slides by midnight, before the talk.
There will be a signup sheet (topic and date) on my door tomorrow.
![Page 16: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/16.jpg)
Tips for Giving a Good Talk
Winter 2003
Dr Eamonn Keogh
Computer Science & Engineering Department
University of California - Riverside
Riverside, CA [email protected]
Modified from the notes of Edward R. Tufte, Craig S. Kaplan, Eamonn Keogh and others
![Page 17: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/17.jpg)
Outline
Advice on giving talks
• General advice
• Organization
• Making clear overheads
• Avoiding common pitfalls
Conclusion
![Page 18: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/18.jpg)
General Advice I
• Show up early. You may have a chance to head off some technical or ergonomic problem.
• Have a backup plan. If your lecture is based on a PowerPoint presentation, have overhead backups of each page.
• Check out the room ahead of time. Before your talk, check out the room, and make sure it has everything you need.
![Page 19: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/19.jpg)
General Advice II
• Never apologize. Most people wouldn’t have noticed the issues for which you’re apologizing, and it just sounds lame.
• Invest in a laser pointer. They are inexpensive and extremely useful.
• Rehearse timing. Poor timing is the most common sin!
![Page 20: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/20.jpg)
Overheads I
• Use large fonts. Use the biggest fonts realistically possible. Small fonts are hard to read.
• Use highly contrasting colors.
• Avoid busy backgrounds. Too much in the background makes the text hard to read.
![Page 21: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/21.jpg)
Overheads II
• Avoid using red text. Red text is often hard to read.
• AVOID ALL CAPS! All caps look like you're shouting.
• Include a good combination of words, pictures, and graphics. Variety keeps the presentation interesting.
![Page 22: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/22.jpg)
Overheads III
• Be terse
  Wordy: The sales forecasts show an increase on the horizon.
  Terse: Sales are up.
• Use bullets or numbered items appropriately
Goals
• Ease of use
• Reusability
• Reliability
Outline of our method
1. Design
2. Implementation
3. Testing
![Page 23: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/23.jpg)
Overheads IV
• Begin with an introduction slide (who you are, why you are giving a talk, the title of the talk)
• Next, give an outline (“roadmap”). For a short talk, you might want to combine this with the above
• State your point (one simple slide)
• Demonstrate your point (a few slides)
• Review your point (one simple slide)
![Page 24: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/24.jpg)
Overheads V
• End with a slide that reviews the entire talk…
  • We introduced the TSP problem
  • We explained why it is an important problem
  • We explained why it is a hard problem
  • We introduced a new heuristic to solve TSP
  • We empirically demonstrated the utility of our approach
• End “cleanly”; don’t fade away.
![Page 25: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/25.jpg)
Overheads VI
• Avoid using “standard” clipart, backgrounds, etc.
I have seen this at least 20 times in conference presentations.
![Page 26: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/26.jpg)
Overheads VII
• Be careful with acronyms…
[Example slide: unexplained labels such as C_max, C_min, Range_i, Diameter_i, R1, D1, R2, D2, and “Neighboring Unlabeled Token”]
![Page 27: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/27.jpg)
Annoying Personal Habits I (This means you)
• Playing with jewelry
• Licking and/or biting your lips
• Constantly adjusting your glasses
• Popping the top of a pen
• Playing with facial hair (men)
• Playing with/twirling your hair (women)
![Page 28: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/28.jpg)
Annoying Personal Habits II (This means you)
• Jingling change in your pocket
• Leaning against anything for support
• Fillers: “ah”, “um”, and “and”
• Starting every sentence with the same word
• Sticky floor syndrome
• Avoiding eye contact
• Lack of enthusiasm
“Basically” and “essentially” seem to be the current favorites.
![Page 29: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/29.jpg)
Conclusion
• We have motivated the need for a high quality talk
• We have seen various tips on creating high quality overheads
• We have seen various hints on avoiding common pitfalls
![Page 30: Association Rules (market basket analysis)](https://reader035.vdocuments.mx/reader035/viewer/2022062422/56813050550346895d95ff3a/html5/thumbnails/30.jpg)
Questions?
Dr Eamonn Keogh
Computer Science & Engineering Department
University of California - Riverside
Riverside, CA [email protected]