Market Basket Analysis
Description: Market basket analysis, Apriori algorithm

MARKET BASKET ANALYSIS USING R TOOL
Gaurav Mittal, DOMS-NITT

What is Market Basket Analysis?
- Understanding the behavior of shoppers: which items are bought together, and what is in each shopping cart or basket?
- Basket data consist of a collection of transactions, each with a transaction date and the set of items bought in that transaction (an itemset).
- Retail organizations are interested in making well-founded decisions and strategy based on the analysis of transaction data: what to put on sale, how to place merchandise on shelves to maximize profit, and how to segment customers by buying pattern.

Market Basket Analysis
MBA uses this information to:
- Identify who customers are (not by name)
- Understand why they make certain purchases
- Gain insight about its merchandise (products):
  - Fast and slow movers
  - Products which are purchased together
  - Products which might benefit from promotion
- Take action:
  - Store layouts
  - Which products to put on special, promote, or offer coupons for

Combining all of this with a customer loyalty card makes it even more valuable.

Examples
- Rule form: LHS → RHS
- IF a customer buys diapers, THEN they also buy beer: diapers → beer
- “Transactions that purchase bread and butter also purchase milk”: {bread, butter} → milk
- Customers who purchase maintenance agreements are very likely to purchase large appliances.
- When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaner.

Definition: Market Basket Analysis (Association Analysis) is a mathematical modeling technique based upon the theory that if you buy a certain group of items, you are likely to buy another group of items.
It is used to analyze customer purchasing behavior, and it helps to increase sales and maintain inventory by focusing on point-of-sale transaction data.

Definitions and Terminology
- Transaction: a set of items (an itemset).
- Confidence: a measure of the uncertainty or trustworthiness associated with each discovered pattern.
- Support: a measure of how often the collection of items in an association occurs together, as a percentage of all transactions.
- Frequent itemset: an itemset that satisfies the minimum support threshold.
- Strong association rules: rules that satisfy both a minimum support threshold and a minimum confidence threshold.

In association rule mining, we first find all frequent itemsets and then generate strong association rules from the frequent itemsets.
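
The two-step process (frequent itemsets first, then strong rules) can be sketched in base R. This is a minimal illustration and not the Apriori algorithm itself: it enumerates item pairs by brute force, uses the five example transactions from this deck, and the two thresholds are assumed values chosen only for the demonstration.

```r
# The five example transactions used in this deck.
transactions <- list(
  c("Frozen Pizza", "Cola", "Milk"),
  c("Milk", "Potato Chips"),
  c("Cola", "Frozen Pizza"),
  c("Milk", "Pretzels"),
  c("Cola", "Pretzels")
)
min_support <- 0.3     # assumed threshold, for illustration only
min_confidence <- 0.6  # assumed threshold, for illustration only

# Fraction of transactions that contain every item in `items`.
support_of <- function(items) {
  mean(vapply(transactions, function(tx) all(items %in% tx), logical(1)))
}

# Step 1: frequent itemsets (restricted to pairs to keep the sketch short).
items <- sort(unique(unlist(transactions)))
pairs <- combn(items, 2, simplify = FALSE)
frequent_pairs <- Filter(function(p) support_of(p) >= min_support, pairs)

# Step 2: strong rules generated from each frequent pair, in both directions.
strong_rules <- character(0)
for (p in frequent_pairs) {
  for (lhs in p) {
    rhs <- setdiff(p, lhs)
    confidence <- support_of(p) / support_of(lhs)
    if (confidence >= min_confidence) {
      strong_rules <- c(strong_rules, paste(lhs, "=>", rhs))
    }
  }
}
print(strong_rules)
```

With these thresholds only the Cola and Frozen Pizza pair is frequent, and both directions of the rule pass the confidence check.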

Market Basket Analysis General Concept: Methods
Method:
- Transaction 1: Frozen pizza, cola, milk
- Transaction 2: Milk, potato chips
- Transaction 3: Cola, frozen pizza
- Transaction 4: Milk, pretzels
- Transaction 5: Cola, pretzels

|  | Frozen Pizza | Milk | Cola | Potato Chips | Pretzels |
|---|---|---|---|---|---|
| Frozen Pizza | 2 | 1 | 2 | 0 | 0 |
| Milk | 1 | 3 | 1 | 1 | 1 |
| Cola | 2 | 1 | 3 | 0 | 1 |
| Potato Chips | 0 | 1 | 0 | 1 | 0 |
| Pretzels | 0 | 1 | 1 | 0 | 2 |

Results:
From this table we could derive the association rules:
- If a customer purchases Frozen Pizza, then they will probably purchase Cola.
- If a customer purchases Cola, then they will probably purchase Frozen Pizza.
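
The co-occurrence counts for the five transactions can be reproduced in base R. The sketch below is one possible way to do it, not necessarily how the slide's table was built: it encodes each transaction as a 0/1 row and multiplies the matrix by its own transpose. The variable names are illustrative.

```r
# The five example transactions from the slide.
transactions <- list(
  c("Frozen Pizza", "Cola", "Milk"),
  c("Milk", "Potato Chips"),
  c("Cola", "Frozen Pizza"),
  c("Milk", "Pretzels"),
  c("Cola", "Pretzels")
)
items <- c("Frozen Pizza", "Milk", "Cola", "Potato Chips", "Pretzels")

# Binary incidence matrix: one row per transaction, one column per item.
incidence <- t(vapply(
  transactions,
  function(tx) as.integer(items %in% tx),
  integer(length(items))
))
colnames(incidence) <- items

# crossprod(X) computes t(X) %*% X: entry [i, j] counts the transactions that
# contain both item i and item j; the diagonal counts each item on its own.
cooccurrence <- crossprod(incidence)
print(cooccurrence)
```

The largest off-diagonal entry, Frozen Pizza with Cola, is the pair behind the two rules in the results.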

Market Basket Analysis General Concept: Measures
- Support: a measure of how often the collection of items in an association occurs together, as a percentage of all transactions.
  support = (number of transactions containing the item combination) / (total number of transactions)
  For the rule "If a customer purchases Cola, then they will purchase Frozen Pizza", support = 2 (transactions that include both Cola and Frozen Pizza) / 5 (total transactions) = 40%.
- Confidence: the confidence of the rule “B given A” is a measure of how likely it is that B occurs when A has occurred; 100% means that B always occurs if A has occurred.
  confidence = (support for the combination) / (support for the condition)
  For the rule "If a customer purchases Milk, then they will purchase Potato Chips", confidence = 20% (support for the combination Potato Chips + Milk) / 60% (support for the condition Milk) ≈ 33%.
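
The two worked figures can be checked with a few lines of base R. This is a small sketch over the five example transactions; the helper names `support` and `confidence` are illustrative and are not functions from any package.

```r
# The five example transactions from the earlier slide.
transactions <- list(
  c("Frozen Pizza", "Cola", "Milk"),
  c("Milk", "Potato Chips"),
  c("Cola", "Frozen Pizza"),
  c("Milk", "Pretzels"),
  c("Cola", "Pretzels")
)

# Support: share of transactions that contain all of the given items.
support <- function(items) {
  mean(vapply(transactions, function(tx) all(items %in% tx), logical(1)))
}

# Confidence of "lhs => rhs": support of the combination divided by
# support of the condition.
confidence <- function(lhs, rhs) {
  support(c(lhs, rhs)) / support(lhs)
}

support_cola_pizza <- support(c("Cola", "Frozen Pizza"))
confidence_milk_chips <- confidence("Milk", "Potato Chips")
print(support_cola_pizza)     # 2 of 5 transactions
print(confidence_milk_chips)  # 0.2 / 0.6
```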

Association Rules Apply Elsewhere
- Retail: supermarkets, etc.
- Purchases made using credit/debit cards.
- Optional telco service purchases.
- Banking services.
- Unusual combinations of insurance claims can be a warning of fraud.
- Medical patient histories.
- Restaurants and fast-food centres.

Preparing Data for MBA
- Determining the scope of the dataset (one or many stores, what period, etc.)
- Converting transaction data to itemsets
- Generalizing items to an appropriate level
  - Depends on the objective of the model
- Rolling up rare items to get adequate support

INTRODUCTION TO R
- R is a programming language and software environment for statistical computing and graphics.
- R is part of the GNU project. Its source code is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems.
- R uses a command line interface, though several graphical user interfaces are available.
- The Comprehensive R Archive Network (CRAN) makes it easy to benefit from others’ work, to share your own work, and to get feedback on potential improvements.

- For computationally intensive tasks, C, C++, and Fortran code can be linked and called at run time.
- R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others) and graphical techniques.
- Another of R's strengths is its graphical facilities, which produce publication-quality graphs that can include mathematical symbols.
- Although R is mostly used by statisticians and other practitioners requiring an environment for statistical computation and software development, it can also be used as a general matrix calculation toolbox with comparable benchmark results to GNU Octave and its proprietary counterpart, MATLAB.

THE R ENVIRONMENT
R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes:
- an effective data handling and storage facility,
- a suite of operators for calculations on arrays, in particular matrices,
- a large, coherent, integrated collection of intermediate tools for data analysis, and
- graphical facilities for data analysis and display either on-screen or on hardcopy.

Packages
- The capabilities of R are extended through user-submitted packages, which provide specialized statistical techniques, graphical devices, and import/export capabilities for many external data formats.
- A statistical package is a suite of computer programs that are specialised for statistical analysis. It enables people to obtain the results of standard statistical procedures and statistical significance tests, without requiring low-level numerical programming.

Process Methodology
The data is obtained from the Excel sheet provided by the customer.

Each row contains:
- BUS_DT: Business Date
- REST_NO: Restaurant Number
- RTL_TRAN_NO: Transaction Number
- MENU_ITEM_KEY: Product Key Number
- MENU_ITEM_PLU: Menu Product Number
- MENU_ITEM_NAME: Product Name
- RCPT_DT_TMSTP: Date of Transaction
- HALF_HOUR_KEY: The half hour in which the transaction occurred
- COMBO_IND: Whether the product is offered with something else
- SERVICE_MODE_CODE: Eating in / taken away
- CGY: Category
- RGLR_PRC: Regular Price
- DRV_PRC: Derived Price
- ITEM_QTY: Number of Products Ordered

Products offered at the store
- WHOPPER
- TENDERCRISP Chicken Sandwich
- Crown-shaped CHICKEN TENDERS
- French Fries
- Hamburger
- Cheeseburger
- DOUBLE CROISSAN'WICH
- BK BURGER SHOTS
- KRAFT Macaroni and Cheese
- Drinks

- The given data is changed into a new format in which each record contains all items purchased in a single transaction.
- This is done using the VLOOKUP function in Excel.
- The resulting data is restructured to remove the multiple lines belonging to the same transaction, using the if…then method in Excel.
- The data is then ready to be fed to the statistical application.
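
The same restructuring, from one line per purchased item to one record per transaction, can also be done in R rather than in Excel. The sketch below is an alternative to the VLOOKUP approach described above, not the method used in this project. The sample rows are hypothetical; only the column names RTL_TRAN_NO and MENU_ITEM_NAME come from the data description.

```r
# Hypothetical sample rows; only the column names come from the slides.
sales <- data.frame(
  RTL_TRAN_NO = c(101, 101, 102, 103, 103, 103),
  MENU_ITEM_NAME = c("WHOPPER", "French Fries", "Cheeseburger",
                     "WHOPPER", "Drinks", "French Fries"),
  stringsAsFactors = FALSE
)

# Group item names by transaction number: one character vector per basket.
baskets <- split(sales$MENU_ITEM_NAME, sales$RTL_TRAN_NO)

# Drop items repeated within the same basket.
baskets <- lapply(baskets, unique)

# One line per transaction, with the items separated by commas.
basket_lines <- vapply(baskets, paste, character(1), collapse = ",")
print(basket_lines)
```

A list of baskets like this can typically be handed to arules with `as(baskets, "transactions")`; that step is left out here so the sketch runs without any packages.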

Working in R
Download Rcmdr (R Commander), which is a GUI, and the arules package for Apriori association rules from CRAN.
The Rcmdr GUI can be run to load the data into R, or the data can be loaded directly from the command line:

    Dataset <- read.table("C:/Users/mittal/Documents/mittal.csv", header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)

Loading the arules package:

    library("arules")

To inspect the transactions:

    # inspect() expects a transactions object, so the data frame is coerced first
    # (this assumes its columns are factors or logicals)
    Transactions <- as(Dataset, "transactions")
    inspect(Transactions)

Next, we call the function apriori() to find all rules (the default association type for apriori()) with a minimum support of 1% and a minimum confidence of 0.6. The code shown uses the Adult dataset that ships with arules.

    data("Adult")
    rules <- apriori(Adult, parameter = list(support = 0.01, confidence = 0.6))

Asking for the rules:

    rules

Selecting rules by right-hand side and lift, then getting the summary of the selected rules:

    rules_whopper <- subset(rules, subset = rhs %in% "income=small" & lift > 1.2)
    rules_hamburger <- subset(rules, subset = rhs %in% "income=large" & lift > 1.2)
    summary(rules_whopper)
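
The subsets above filter on lift, which the slides do not define. Under the usual definition, lift(A => B) = confidence(A => B) / support(B), so a lift above 1 means A and B occur together more often than they would if they were independent. The base R sketch below applies that definition to the five example transactions from earlier in this deck; the helper names are illustrative and are not arules functions.

```r
# The five example transactions from the earlier slide.
transactions <- list(
  c("Frozen Pizza", "Cola", "Milk"),
  c("Milk", "Potato Chips"),
  c("Cola", "Frozen Pizza"),
  c("Milk", "Pretzels"),
  c("Cola", "Pretzels")
)

# Share of transactions that contain all of the given items.
support <- function(items) {
  mean(vapply(transactions, function(tx) all(items %in% tx), logical(1)))
}

# Lift of "lhs => rhs": confidence of the rule divided by support of rhs.
lift <- function(lhs, rhs) {
  confidence <- support(c(lhs, rhs)) / support(lhs)
  confidence / support(rhs)
}

lift_cola_pizza <- lift("Cola", "Frozen Pizza")  # above 1: bought together
lift_milk_cola <- lift("Milk", "Cola")           # below 1: rarely together
print(c(lift_cola_pizza, lift_milk_cola))
```

A filter such as `lift > 1.2` therefore keeps only rules whose items co-occur noticeably more often than chance would suggest.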

The recommendations
- Whopper can be bundled with Coke, Minute Maid orange juice, and French toast sticks.
- Cheeseburger can be bundled with French fries and onion rings.
- French fries with HERSHEY®'S Fat Free Milk.
- Dutch Apple Pie with Bacon, Egg & Cheese Biscuit Sandwich.

Challenges…!!!
- Data with more than 799 rows could not be loaded.
- R is usable for learning purposes, but difficult to use for industrial purposes where large amounts of data must be analyzed.
- Limited knowledge is available to guide analysis development in R.
- New code has to be developed to extend the database.
Thank You