Real-Time Market Basket Analysis for Retail with Hadoop

Simone Ferruzzi and Marco Mantovani, Iconsulting Spa

Upload: hadoopsummit

Post on 22-Nov-2014

TRANSCRIPT

Page 1: Real-time Market Basket Analysis for Retail with Hadoop

Real-Time Market Basket Analysis for Retail with Hadoop

Simone Ferruzzi and Marco Mantovani, Iconsulting Spa

Page 2: Real-time Market Basket Analysis for Retail with Hadoop

@IconsultingBI 

Real-Time Market Basket Analysis for Retail with Hadoop

Page 3: Real-time Market Basket Analysis for Retail with Hadoop

WHO WE ARE

ICONSULTING

ICONSULTING IS AN INDEPENDENT CONSULTING COMPANY SPECIALIZED IN DWH, BI & PM

Strong expertise on all the market leading technologies

INNOVATIVE

SPECIALIZED

DEVELOPING SKILLS

VENDOR INDEPENDENT

More than 300 projects; more than 100 customers

Professorship in main Italian Universities and Business Schools

In-house Academy providing education services to professionals who need to develop their skills

Spin-off of a major Research University Consortium

25% of our time invested in R&D

Certified Partner of the main Business Intelligence software vendors

# Data Warehouse # Business Intelligence # Performance Management

Page 4: Real-time Market Basket Analysis for Retail with Hadoop

ICONSULTING Methodology

WHAT REALLY COUNTS

PROCEDURES & OPERATING INSTRUCTIONS ACCORDING TO ISO 9001:2008

STEP BY STEP APPROACH

PROJECT REQUIREMENTS & CONSTRAINTS

SERVICE QUALITY

TIME & COSTS EXECUTION

MEETING DEADLINES

PROBLEMS & RISKS MANAGEMENT

COMMUNICATION AMONG STAKEHOLDERS

AGILE DESIGN THINKING METHODOLOGY

Page 5: Real-time Market Basket Analysis for Retail with Hadoop

Our CUSTOMERS

MANUFACTURING: ALFA WASSERMANN, AMPLIFON, ARISTON THERMO, CAMAR SMA, CANTIERI SANLORENZO, CASE NEW HOLLAND, FEDRIGONI, G.D, CISA (Ingersoll-Rand), DUCATI MOTOR HOLDING, ESSECO, FIAMM, FONTANOT, GRUPPO COESIA, GRUPPO FABBRI, ICF - LA FAENZA, IGUZZINI, I.M.A. INDUSTRIA MACCHINE AUTOMATICHE, INTERTABA - PHILIP MORRIS, KME, KOMATSU, LOWARA, MAGNETI MARELLI, MALAVOLTA CORPORATE, MAPEI, MARAZZI, MARPOSS, NEGRI BOSSI, OVA BARGELLINI, OTIS, PHILIP MORRIS ITALIA, PIRELLI, POZZI GINORI, ROSETTI MARINO, SACMI, SECI, SONY EUROPA, TEUCO GUZZINI, UNO A ERRE, VINAVIL

MEDIA & PUBLISHING: PANINI GROUP, SKY ITALIA, VODAFONE, ZANICHELLI EDITORE

GOVERNMENT & PUBLIC SECTOR: MINISTERO DELL’INTERNO, MINISTERO DEL LAVORO E DELLE POLITICHE SOCIALI, REGIONE EMILIA ROMAGNA, REGIONE CALABRIA, REGIONE VENETO, AGREA, ARPA, ARPAT, CESIA, COMUNE DI BOLOGNA, COMUNE DI REGGIO EMILIA, ERVET, INVITALIA, I.S.P.R.A. AMBIENTE, ISTITUTO NAZIONALE FISICA NUCLEARE, LEPIDA, PROV. AUTONOMA DI BOLZANO, PROV. AUTONOMA DI TRENTO, PROVINCIA DI RIMINI, UNIVERSITA’ DI BOLOGNA

SERVICES: DAY RISTOSERVICE, GRUPPO SOCIETA’ GAS RIMINI, MOBY, RINA, SIENAMBIENTE, SOFIS

FASHION: CALZEDONIA, DIESEL, GEOX, GUCCI, IMAX, LOTTO, MILAR

FINANCIAL SERVICES: CREDIT SUISSE, DEXIA CREDIOP, FGA CAPITAL (GRUPPO FIAT), UNIPOL BANCA

FOOD: BIRRA PERONI, ERIDANIA SADAM, GRANDI SALUMIFICI ITALIANI, MASSIMO ZANETTI BEVERAGE GROUP, MONTENEGRO, SALUMIFICIO FRATELLI BERETTA, SEGAFREDO

LARGE SCALE RETAIL: CONAD ADRIATICO, LA RINASCENTE, SMA (SIMPLY MARKET), VIP CATERING

Page 6: Real-time Market Basket Analysis for Retail with Hadoop

Business Intelligence

Turning data into Information

Historicize and Organize Information

Facilitating access to information

Evolution Trends (Big Data)

+ end users, + information, + performance

Analytics and Business Intelligence

Mobile Technologies

Cloud Computing

Collaboration Technologies

Connect analysis to Action

Analyze data in Real Time

Self-service BI

Advanced visualization (mapping, etc.)

New data type (unstructured data / text)

Information Discovery on Big Data

New channels of access (Mobile)

Collaboration & Social

Page 7: Real-time Market Basket Analysis for Retail with Hadoop

Market Basket Analysis for Retail

Client: Major Italian fashion company (3000+ points of sale worldwide)

Need: Market Basket Analysis on sold items.

• Input: single invoice lines.

• Output: association rules to verify marketing campaigns, seasonal shopping habits, shop layouts, etc.

Solution:

• Based on the Hadoop ecosystem

• Fully integrated with the Business Intelligence platform (Oracle Business Intelligence Enterprise Edition)

Page 8: Real-time Market Basket Analysis for Retail with Hadoop

Market Basket Analysis key concepts

• Market Basket Analysis (MBA) is an application of data mining algorithms aimed at identifying frequent patterns and co-occurrence relationships.

• Given a set of input data, MBA returns a set of association rules of the form

A → B

meaning «If A occurs, then B is likely to occur» (in this case, «If you buy product A, you are also likely to buy B»).

• Each rule is associated with two values that measure its degree of interest:

– Support: the percentage of cases in which the two events A and B occur together, out of all the cases considered (e.g., the number of receipts in which A and B appear together divided by the total number of receipts);

– Confidence: the percentage of cases in which the two events A and B occur together, out of the cases where A occurs (e.g., the number of receipts that contain both products A and B divided by the total number of receipts where A appears).
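The two measures defined on this slide can be written down directly. The following is a minimal sketch, assuming each receipt is represented as a set of item names; the function names and the toy receipts are illustrative and do not come from the presentation.

```python
def support(receipts, itemset):
    """Fraction of receipts that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    if not receipts:
        return 0.0
    hits = sum(1 for receipt in receipts if itemset <= receipt)
    return hits / len(receipts)


def confidence(receipts, antecedent, consequent):
    """Among receipts containing the antecedent A, the fraction that also contain B."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    with_a = sum(1 for receipt in receipts if a <= receipt)
    if with_a == 0:
        return 0.0
    with_both = sum(1 for receipt in receipts if both <= receipt)
    return with_both / with_a


# Toy data (made up for illustration): four receipts over products A, B, C.
toy_receipts = [
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"A", "B", "C"}),
    frozenset({"B"}),
]

rule_support = support(toy_receipts, {"A", "B"})          # 2 of 4 receipts
rule_confidence = confidence(toy_receipts, {"A"}, {"B"})  # 2 of the 3 receipts with A
```

Note that support is symmetric in A and B, while confidence depends on the direction of the rule.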

Page 9: Real-time Market Basket Analysis for Retail with Hadoop

Example of an association rule

• Easywear → Underwear

• Support: 9%

• Confidence: 50%

• In 9% of cases Easywear and Underwear products are sold together.

• In 50% of cases when someone purchases an Easywear item, an Underwear item is also purchased.
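Combining the two figures with the definitions of support and confidence gives one more number that the slide does not state explicitly: the share of receipts that contain an Easywear item at all. A short derivation (the notation s(X) is introduced here for illustration, and `align*` assumes the amsmath package):

```latex
% s(X): fraction of receipts that contain every item in X
\begin{align*}
\mathrm{confidence}(A \rightarrow B) &= \frac{s(A \cup B)}{s(A)} \\[4pt]
s(\text{Easywear}) &= \frac{s(\text{Easywear} \cup \text{Underwear})}
                          {\mathrm{confidence}(\text{Easywear} \rightarrow \text{Underwear})}
                    = \frac{0.09}{0.50} = 0.18
\end{align*}
```

So, under these figures, about 18% of receipts include an Easywear item, and half of those also include an Underwear item.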

Page 10: Real-time Market Basket Analysis for Retail with Hadoop

Case study: MBA for Retail

• Leading Italian company in the fashion industry

• Sales data from the last three years

• More than 100 million receipts

• The results obtained can be used as an indicator for:

– Defining new promotional initiatives

– Identifying optimal schemes for the layout of goods in stores

– etc.

Page 11: Real-time Market Basket Analysis for Retail with Hadoop

Architecture

[Architecture diagram. Labels: Receipts; MBA job; Associative Rules; Number of sold items & Associative Rules; Job Management Console; Email; Interactive Dashboards]

Page 12: Real-time Market Basket Analysis for Retail with Hadoop

MBA Algorithm Steps

Input: list of single sold items (receipt lines)

Job 1

• Map: receipt key, item value

• Reduce: items list aggregated by receipt

Job 2

• Map: combination of items inside the same receipt

• Reduce: support of the itemsets

Job 3

• Map: calculation of all possible Association Rules that meet minimum Support criteria

• Reduce: Association Rules that meet minimum Confidence criteria
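The three jobs above can be sketched in plain Python to make the data flow concrete. This is not the presenters' Hadoop code: `run_job` is a small in-memory stand-in for one map, shuffle, reduce cycle, and all names, the `max_size` cap on itemset size, and the way Job 3 joins each rule with its antecedent's support are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def run_job(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key, values in groups.items():
        output.extend(reducer(key, values))
    return output


def mine_rules(receipt_lines, min_support, min_confidence, max_size=3):
    """receipt_lines: iterable of (receipt_id, item) pairs."""
    # Job 1: key each line by receipt, then collect the items of each receipt.
    baskets = run_job(
        receipt_lines,
        mapper=lambda line: [(line[0], line[1])],
        reducer=lambda receipt_id, items: [frozenset(items)],
    )
    total = len(baskets)

    # Job 2: emit every item combination in a receipt, then compute its support.
    def item_combinations(basket):
        items = sorted(basket)
        for size in range(1, min(max_size, len(items)) + 1):
            for combo in combinations(items, size):
                yield frozenset(combo), 1

    supports = run_job(
        baskets,
        mapper=item_combinations,
        reducer=lambda itemset, ones: [(itemset, sum(ones) / total)],
    )

    # Job 3 map: from itemsets meeting min_support, emit candidate rules keyed
    # by antecedent, plus each itemset's own support for the join in reduce.
    def candidate_rules(record):
        itemset, supp = record
        if supp < min_support:
            return
        yield itemset, ("self", None, supp)
        items = sorted(itemset)
        for size in range(1, len(items)):
            for combo in combinations(items, size):
                antecedent = frozenset(combo)
                yield antecedent, ("rule", itemset - antecedent, supp)

    # Job 3 reduce: confidence = support(rule) / support(antecedent).
    def confident_rules(antecedent, values):
        own = next(s for tag, _, s in values if tag == "self")
        for tag, consequent, supp in values:
            if tag == "rule" and supp / own >= min_confidence:
                yield antecedent, consequent, supp, supp / own

    return run_job(supports, candidate_rules, confident_rules)


# Toy receipt lines (made up): receipt id, item.
sample_lines = [
    ("r1", "x"), ("r1", "y"),
    ("r2", "x"), ("r2", "y"), ("r2", "z"),
    ("r3", "x"),
    ("r4", "y"),
]
sample_rules = mine_rules(sample_lines, min_support=0.5, min_confidence=0.6)
```

Enumerating every item combination per receipt grows quickly with basket size, which is why the sketch caps itemsets at `max_size`; how the original jobs bound this is not stated in the slides.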

Page 13: Real-time Market Basket Analysis for Retail with Hadoop

Job Management Interface

• Interface integrated with standard BI tool

• MBA Algorithm can run on different data sets

• Each user can perform custom analysis

• Algorithm parameters (minimum support and confidence) can be set by end users

• Examples of different analyses:

– what types of products are sold together with a discounted item?

– are there different association rules between products sold in city-center stores and those in outlets?

Page 14: Real-time Market Basket Analysis for Retail with Hadoop

Job Management Interface

Analysis Description

Time filters

Point of Sales filters

Product filters

Attributes used for association rules

Support & Confidence parameters

Run MBA
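The form fields above amount to a parameter set for one MBA run. As a rough sketch of what such a request could look like in code, the class below mirrors those fields; `MbaJobRequest`, its attribute names, and the filter keys are hypothetical and are not the interface shown in the presentation. The example values are taken from the campaign analysis described later in the deck.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class MbaJobRequest:
    """Hypothetical container for the parameters of one MBA run."""
    description: str
    date_from: date                  # time filters
    date_to: date
    min_support: float               # thresholds as fractions in [0, 1]
    min_confidence: float
    pos_filters: dict = field(default_factory=dict)      # point-of-sale filters
    product_filters: dict = field(default_factory=dict)  # product filters
    rule_attributes: tuple = ("product_type",)           # attributes used for rules

    def __post_init__(self):
        if self.date_from > self.date_to:
            raise ValueError("date_from must not be after date_to")
        for name in ("min_support", "min_confidence"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1")


# Example request; the dictionary keys are assumptions for illustration.
campaign_request = MbaJobRequest(
    description="New bra campaign, city-center stores",
    date_from=date(2013, 9, 1),
    date_to=date(2013, 12, 31),
    min_support=0.35,
    min_confidence=0.50,
    pos_filters={"country": "Italy", "location": "city center"},
    product_filters={"exclude_types": ["knitwear"]},
)
```

Validating the thresholds when the request is built means an out-of-range value is rejected before any cluster job is submitted.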

Page 15: Real-time Market Basket Analysis for Retail with Hadoop

Results Dashboard

[Dashboard screenshot. Visible labels: Support, Confidence]

Page 16: Real-time Market Basket Analysis for Retail with Hadoop

Analysis Examples

• From 01/09/2013 to 31/12/2013: marketing campaign for a new type of bra

• All Italian points of sale located in city centers

• Analysis between all types of item except knitwear

• Min. support 35%, min. confidence 50%

Rules found: new bra → slip, babydoll (support: 36%, confidence: 52%)

Meaning: 36% of the considered receipts contain all those items; when the new bra is purchased, 52 times out of 100 a slip and a babydoll are also purchased.

Same configuration as before, but considering only PoS in shopping centers

Rules found: Easywear → new bra (support: 50%, confidence: 60%)

Meaning: in shopping centers, the sales of Easywear drive the sales of the new bra.

Page 17: Real-time Market Basket Analysis for Retail with Hadoop

Conclusions and future work

Conclusions

• Business users can now investigate in depth the effectiveness of marketing and advertising campaigns and determine whether shop windows and in-store layouts reach the desired goals.

• The Market Basket Analysis algorithm can be customized to users’ needs.

• Transparent interaction between Hadoop Cluster and Business Intelligence platform.

Future work: from project to solution

• Complete framework to run complex Data Mining algorithms on Big Data.

• Hadoop to exploit parallel execution and Distributed File System.

• Seamless integration with standard Business Intelligence tools.

• More user independence in data integration.

Page 18: Real-time Market Basket Analysis for Retail with Hadoop

Real-Time Market Basket Analysis for Retail with Hadoop
