Predicting Customer Behavior with R: Part 1 Matthew Baggott, Ph.D. University of Chicago


Post on 17-May-2015

DESCRIPTION

An introduction to modeling non-contractual customer purchasing with the BTYD package in R

TRANSCRIPT

Page 1: Predicting Customer Behavior with R: Part 1

Predicting Customer Behavior with R: Part 1

Matthew Baggott, Ph.D., University of Chicago

Page 2: Predicting Customer Behavior with R: Part 1

Goal for Today’s Workshop

• Use R and the BTYD package to make a Pareto/Negative Binomial Distribution model of customer purchasing

• Understand our assumptions and how we could refine them

Page 3: Predicting Customer Behavior with R: Part 1

Goal for Today’s Workshop

From this:

cust  date        sales
1     1997-01-01  29.33
1     1997-01-18  29.73
1     1997-08-02  14.96
1     1997-12-12  26.48
2     1997-01-01  63.34
2     1997-01-13  11.77
3     1997-01-01   6.79
4     1997-01-01  13.97
5     1997-01-01  23.94
6     1997-01-01  35.99
6     1997-01-11  32.99
6     1997-06-23  91.92
6     1997-07-22  47.08
6     1997-07-26  71.96
6     1997-10-25  78.47
6     1997-12-06  83.47
6     1998-01-18  84.46

To this: [figure]

Page 4: Predicting Customer Behavior with R: Part 1

• Tutorial assumes working knowledge of R (but feel free to ask questions)

• Main R packages used: BTYD, plyr, ggplot2, reshape2, lubridate

• BTYD vignette covers some of the same ground

• R script to carry out today’s analysis is at: gist.github.com/mattbaggott/5113177

Page 5: Predicting Customer Behavior with R: Part 1

Why Model?

• Help separate
  – active customers,
  – inactive customers who should be re-engaged, and
  – unprofitable customers

• Forecast future business profits and needs

Page 6: Predicting Customer Behavior with R: Part 1

Annual Customer ‘Defection’ Rates are High

Industry                          Defection Rate
Internet service providers        22%
U.S. long distance (telephone)    30%
German mobile telephone market    25%
Clothing catalogs                 25%
Residential tree and lawn care    32%
Newspaper subscriptions           66%

Griffin and Lowenstein 2001

Page 7: Predicting Customer Behavior with R: Part 1

Why Not Model?

• Wübben & Wangenheim (2008) found simple rules of thumb often beat basic models

• Simple calculations can be faster and clearer:

Long Term Value = (Avg Monthly Revenue per Customer * Gross Margin per Customer) / Monthly Churn Rate
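The rule of thumb above is easy to try out directly. The following is a minimal sketch in R; the three input values are hypothetical and chosen only for illustration, and gross margin is assumed to be expressed as a fraction of revenue.

```r
# Hypothetical inputs, chosen only for illustration
avg_monthly_revenue <- 50    # average monthly revenue per customer ($)
gross_margin        <- 0.30  # gross margin as a fraction of revenue (assumption)
monthly_churn       <- 0.05  # fraction of customers lost each month

# Long Term Value = (revenue * margin) / churn
long_term_value <- (avg_monthly_revenue * gross_margin) / monthly_churn
long_term_value

# 1 / churn is the expected customer lifetime in months under constant churn
expected_lifetime_months <- 1 / monthly_churn
expected_lifetime_months
```

With these inputs the value comes out at roughly $300 per customer over an expected 20-month lifetime. The formula assumes churn is constant over time and identical across customers, which is exactly the assumption the models below relax.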

Page 8: Predicting Customer Behavior with R: Part 1

But it’s All Models

• Our choice is not model vs. no model

• Our choice is formal, scalable models vs. informal, manual models

• We can and should compare, refine, & combine simple rules and complex models

Page 9: Predicting Customer Behavior with R: Part 1

RFM Family of Models

• Models use three variables:
  – Recency of purchases
  – Frequency of purchases
  – Monetary value of purchases

• Used for non-contractual purchasing

• Data needed: dates and amounts of purchases for individual customers

Page 10: Predicting Customer Behavior with R: Part 1

Simple RFM model of Purchasing

1. A probabilistic purchasing process for active customers, modeled as a Poisson process with rate λ

2. A probabilistic dropout process of active customers becoming inactive, modeled as an exponential distribution with dropout rate γ

Page 11: Predicting Customer Behavior with R: Part 1

Simple RFM model of Purchasing

3. Purchasing rates follow a gamma distribution across customers with shape and scale parameters r and α

4. Dropout rates follow a gamma distribution across customers with shape and scale parameters s and β

5. Transaction rate λ and dropout rate γ vary independently across customers

6. Customers are considered in isolation (no indirect value, no influencing each other)

Page 12: Predicting Customer Behavior with R: Part 1

Purchasing as a Poisson process

• Single parameter (the rate λ) giving the constant probability of an event per unit of time

• Each event is independent -- one does not make another one more or less likely

• Other Poisson processes: e-mail arrival, radioactive decay, wars per year (Are these realistic?)

[Figure: Frequency of war (Hayes, 2002)]
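To make the Poisson assumption concrete, here is a small simulation sketch in base R. The rate and the 52-week window are hypothetical values, and `simulate_purchase_times` is a helper written for this illustration, not a BTYD function. It relies on the fact that gaps between events in a Poisson process are independent exponential waiting times.

```r
set.seed(1)       # reproducible illustration
lambda  <- 0.05   # hypothetical purchase rate (purchases per week)
horizon <- 52     # observation window in weeks

# Build one customer's purchase times by accumulating exponential gaps
simulate_purchase_times <- function(lambda, horizon) {
  times <- numeric(0)
  t <- rexp(1, rate = lambda)
  while (t <= horizon) {
    times <- c(times, t)
    t <- t + rexp(1, rate = lambda)
  }
  times
}

# Number of purchases in the window for 5000 simulated customers
counts <- replicate(5000, length(simulate_purchase_times(lambda, horizon)))

mean(counts)  # should be close to lambda * horizon
var(counts)   # for a Poisson count, the variance is close to the mean
```

If real purchase counts show a variance far above their mean, that is a sign that a single shared rate is too simple, which motivates letting the rate vary across customers.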

Page 13: Predicting Customer Behavior with R: Part 1

Dropout rates

• Latent variable: without subscriptions, not directly observed

• ‘Right censored’ (we don’t know the future)

• Fancy survival / hazard models possible (such as Cox regression)

• Here, we use a simple exponential function with a constant dropout rate γ > 0: $f(t) = \gamma e^{-\gamma t}$
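The exponential dropout assumption can be explored with base R's exponential distribution functions. This is a sketch with a hypothetical dropout rate; `gamma_rate` is just an illustrative variable name for γ.

```r
gamma_rate <- 0.05            # hypothetical dropout rate per week
weeks <- c(0, 13, 26, 52)

# Density f(t) = gamma * exp(-gamma * t), which is what dexp() computes
density_by_hand <- gamma_rate * exp(-gamma_rate * weeks)
densities_match <- isTRUE(all.equal(density_by_hand, dexp(weeks, rate = gamma_rate)))
densities_match

# Probability that a customer is still active after t weeks: exp(-gamma * t)
p_still_active <- pexp(weeks, rate = gamma_rate, lower.tail = FALSE)
data.frame(weeks, p_still_active)

# Expected active lifetime in weeks is 1 / gamma
expected_lifetime <- 1 / gamma_rate
expected_lifetime
```

A constant rate means a customer who has survived a year is no more or less likely to drop out next week than a brand-new one, which is a strong assumption worth questioning.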

Page 14: Predicting Customer Behavior with R: Part 1

Gamma distributions

• Family of continuous probability distributions with two parameters, shape and scale/rate.

• Often used to fit scale/rate parameters, as we do here with Poisson and exponential distributions.
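The sketch below illustrates why gamma-distributed purchase rates lead to a negative binomial distribution of purchase counts, the "NBD" in the model's name. The parameter values are hypothetical, dropout is ignored for simplicity, and the second gamma parameter is treated as a rate (so the mean purchase rate is shape / rate), which is an assumption of this sketch.

```r
set.seed(42)
r     <- 0.55     # hypothetical gamma shape
alpha <- 10.6     # hypothetical gamma rate (weeks)
weeks <- 39       # length of the observation window
n     <- 20000    # number of simulated customers

# Each customer gets their own purchase rate, then a Poisson count
lambda <- rgamma(n, shape = r, rate = alpha)
counts <- rpois(n, lambda * weeks)

# A gamma mixture of Poissons is negative binomial with
# size = r and prob = alpha / (alpha + weeks)
observed_p0 <- mean(counts == 0)
nbd_p0      <- dnbinom(0, size = r, prob = alpha / (alpha + weeks))
c(observed = observed_p0, nbd = nbd_p0)
```

The simulated share of customers with zero purchases should land close to the negative binomial probability, and far above what a single shared Poisson rate would predict.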

Page 15: Predicting Customer Behavior with R: Part 1

Model is of repeat customers

• Customers are only customers after they make their first purchase

• Frequency is not defined for first purchase

• We will change the purchase data log into a repeat purchase data log with dc.SplitUpElogForRepeatTrans() or as part of dc.ElogToCbsCbt()
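To show the idea of a repeat-purchase log without BTYD, here is a base R sketch on a toy event log. The rows are copied from the first three CDNOW customers shown in the sample data; `toy_elog` and `repeat_elog` are illustrative names, and this is only an approximation of what dc.SplitUpElogForRepeatTrans() does, not its implementation.

```r
# Toy event log: first three customers from the CDNOW sample rows
toy_elog <- data.frame(
  cust  = c(1, 1, 1, 2, 2, 3),
  date  = as.Date(c("1997-01-01", "1997-01-18", "1997-08-02",
                    "1997-01-01", "1997-01-13", "1997-01-01")),
  sales = c(29.33, 29.73, 14.96, 63.34, 11.77, 6.79)
)

# Sort so that each customer's first purchase comes first
toy_elog <- toy_elog[order(toy_elog$cust, toy_elog$date), ]

# Drop the first row per customer, keeping only repeat purchases
is_first    <- !duplicated(toy_elog$cust)
repeat_elog <- toy_elog[!is_first, ]
repeat_elog

# Repeat-purchase counts, keeping one-time buyers with a count of zero
repeat_counts <- table(factor(repeat_elog$cust, levels = unique(toy_elog$cust)))
repeat_counts
```

Customer 3 bought only once, so they vanish from the repeat log but still need a frequency of zero in the model input.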

Page 16: Predicting Customer Behavior with R: Part 1

CDNOW Data set

• We will use data from online retailer CDNOW, included in BTYD package

• 10% of the cohort of customers who made their first transactions in the first quarter of 1997

• 6919 purchases by 2357 customers over a 78-week period

• Not too big; we won’t need to wait long

Page 17: Predicting Customer Behavior with R: Part 1

Install/load packages

InstallCandidates <- c("ggplot2", "BTYD", "reshape2", "plyr", "lubridate")
# check if pkgs are already present
toInstall <- InstallCandidates[!InstallCandidates %in% library()$results[,1]]
if (length(toInstall) != 0) {
  install.packages(toInstall, repos = "http://cran.r-project.org")
}

# load pkgs
lapply(InstallCandidates, library, character.only = TRUE)

Page 18: Predicting Customer Behavior with R: Part 1

Load data

cdnowElog <- system.file("data/cdnowElog.csv", package = "BTYD")
elog <- read.csv(cdnowElog)               # read data
head(elog)                                # take a look
elog <- elog[,c(2,3,5)]                   # we need these columns
names(elog) <- c("cust","date","sales")   # model funcs expect these names
# format date
elog$date <- as.Date(as.character(elog$date), format="%Y%m%d")

Page 19: Predicting Customer Behavior with R: Part 1

Aggregate by cust, dates

• Our model is concerned with inter-purchase intervals.

• We only have dates (w/o times) and there may be multiple purchases on a day

• We merge all transactions that occurred on the same day:

elog <- dc.MergeTransactionsOnSameDate(elog)
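For intuition, the same-day merge can be approximated in base R with aggregate(). This is a sketch on hypothetical toy data, and it assumes that sales on the same day should be summed; it is not the BTYD implementation.

```r
# Hypothetical toy log: customer 1 has two purchases on the same day
toy_elog <- data.frame(
  cust  = c(1, 1, 1, 2),
  date  = as.Date(c("1997-01-01", "1997-01-01", "1997-01-18", "1997-01-01")),
  sales = c(10, 5, 20, 7)
)

# One row per customer per date, with sales summed (assumption)
merged <- aggregate(sales ~ cust + date, data = toy_elog, FUN = sum)
merged <- merged[order(merged$cust, merged$date), ]
merged
```

After merging, every inter-purchase interval for a customer is at least one day, which avoids zero-length gaps in the later histogram of days between purchases.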

Page 20: Predicting Customer Behavior with R: Part 1

Plot data

ggplot(elog, aes(x=date,y=sales,group=cust))+
  geom_line(alpha=0.1)+
  scale_x_date()+
  scale_y_log10()+
  ggtitle("Sales for individual customers")+
  ylab("Sales ($, US)")+xlab("")+
  theme_minimal()

(Ugly plot, but it could have revealed data issues.)

Page 21: Predicting Customer Behavior with R: Part 1

A more useful plot

purchaseFreq <- ddply(elog, .(cust), summarize,
  daysBetween = as.numeric(diff(date)))

dev.new()  # open a new graphics window (portable replacement for windows())
ggplot(purchaseFreq,aes(x=daysBetween))+
  geom_histogram(fill="orange")+
  xlab("Time between purchases (days)")+
  theme_minimal()

Page 22: Predicting Customer Behavior with R: Part 1

Divide data into train and test

(end.of.cal.period <- min(elog$date) + as.numeric((max(elog$date) - min(elog$date))/2))

# split data into train (calibration) and test (holdout) and make matrices
data <- dc.ElogToCbsCbt(elog, per="week",
  T.cal=end.of.cal.period,
  merge.same.date=TRUE,  # already did this
  statistic = "freq")    # which CBT to return

# take a look
str(data)

Page 23: Predicting Customer Behavior with R: Part 1

> str(data)
List of 3
 $ cal      :List of 2
  ..$ cbs: num [1:2357, 1:3] 2 1 0 0 0 7 1 0 2 0 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2357] "1" "2" "3" "4" ...
  .. .. ..$ : chr [1:3] "x" "t.x" "T.cal"
  ..$ cbt: num [1:2357, 1:266] 0 0 0 0 0 0 0 0 0 0 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2357] "1" "2" "3" "4" ...
  .. .. ..$ : chr [1:266] "1997-01-08" "1997-01-09" "1997-01-10" "1997-01-11" ...
 $ holdout  :List of 2
  ..$ cbt: num [1:2357, 1:272] 0 0 0 0 0 0 0 0 0 0 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2357] "1" "2" "3" "4" ...
  .. .. ..$ : chr [1:272] "1997-10-01" "1997-10-02" "1997-10-03" "1997-10-04" ...
  ..$ cbs: num [1:2357, 1:2] 1 0 0 0 0 8 0 2 2 0 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2357] "1" "2" "3" "4" ...
  .. .. ..$ : chr [1:2] "x.star" "T.star"
 $ cust.data:'data.frame': 2357 obs. of 5 variables:
  ..$ cust       : int [1:2357] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ birth.per  : Date[1:2357], format: "1997-01-01" ...
  ..$ first.sales: num [1:2357] 29.33 63.34 6.79 13.97 23.94 ...
  ..$ last.date  : Date[1:2357], format: "1997-08-02" ...
  ..$ last.sales : num [1:2357] 14.96 11.77 6.79 13.97 23.94 ...

• $ cal: calibration period matrices
• $ holdout: holdout period matrices
• $ cust.data: customer info

Page 24: Predicting Customer Behavior with R: Part 1

Extract cbs matrix

• cbs is short for “customer-by-sufficient-statistic” matrix, with the sufficient stats being:
  – frequency
  – recency (time of last transaction) and
  – total time observed

cal2.cbs <- as.matrix(data[[1]][[1]])  # first item in list, first item in it
str(cal2.cbs)
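To see what the three sufficient statistics mean, they can be computed by hand for a few customers. This base R sketch uses the first three customers from the CDNOW sample rows and assumes the calibration period ends on 1997-09-30 (the day before the first holdout date shown in str(data)). `weeks_between` and `cbs_by_hand` are illustrative names, and measuring time in weeks from each customer's first purchase is an assumption of this sketch.

```r
toy_elog <- data.frame(
  cust = c(1, 1, 1, 2, 2, 3),
  date = as.Date(c("1997-01-01", "1997-01-18", "1997-08-02",
                   "1997-01-01", "1997-01-13", "1997-01-01"))
)
cal_end <- as.Date("1997-09-30")   # assumed end of calibration period

# Elapsed time in weeks between two dates
weeks_between <- function(from, to) {
  as.numeric(difftime(to, from, units = "days")) / 7
}

# One row of sufficient statistics per customer
cbs_by_hand <- do.call(rbind, lapply(split(toy_elog, toy_elog$cust), function(d) {
  first <- min(d$date)
  data.frame(
    cust  = d$cust[1],
    x     = nrow(d) - 1,                        # frequency: repeat purchases only
    t.x   = weeks_between(first, max(d$date)),  # recency: first to last purchase
    T.cal = weeks_between(first, cal_end)       # total time observed
  )
}))
cbs_by_hand
```

Customer 3 has x = 0 and t.x = 0: a single purchase gives no repeat frequency and no recency beyond the starting point.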

Page 25: Predicting Customer Behavior with R: Part 1

Estimate parameters for model

• Purchase shape and scale params: r and α

• Dropout shape and scale params: s and β

# initial estimate
(params2 <- pnbd.EstimateParameters(cal2.cbs))
# 0.5528797 10.5838911 0.6250764 12.2011828

# look at log likelihood
(LL <- pnbd.cbs.LL(params2, cal2.cbs))
# -9598.711

Page 26: Predicting Customer Behavior with R: Part 1

Estimate parameters for model

# make a series of estimates, see if they converge
p.matrix <- c(params2, LL)
for (i in 1:20) {
  params2 <- pnbd.EstimateParameters(cal2.cbs, params2)
  LL <- pnbd.cbs.LL(params2, cal2.cbs)
  p.matrix.row <- c(params2, LL)
  p.matrix <- rbind(p.matrix, p.matrix.row)
}

# examine
p.matrix

# use final set of values
(params2 <- p.matrix[dim(p.matrix)[1],1:4])

Page 27: Predicting Customer Behavior with R: Part 1

Plot iso-likelihood for param pairs

# parameter names for a more descriptive result
param.names <- c("r", "alpha", "s", "beta")

LL <- pnbd.cbs.LL(params2, cal2.cbs)

dc.PlotLogLikelihoodContours(pnbd.cbs.LL, params2,
  cal.cbs = cal2.cbs,
  n.divs = 5,
  num.contour.lines = 7,
  zoom.percent = 0.3,
  allow.neg.params = FALSE,
  param.names = param.names)

Page 28: Predicting Customer Behavior with R: Part 1

Plot iso-likelihood for param pairs

[Figure: six log-likelihood contour plots, one per parameter pair: r and alpha, r and s, r and beta, alpha and s, alpha and beta, s and beta]

Page 29: Predicting Customer Behavior with R: Part 1

Plot population estimates

# par to make two plots side by side
par(mfrow=c(1,2))

# Plot the estimated distribution of
# customers' propensities to purchase
pnbd.PlotTransactionRateHeterogeneity(params2, lim = NULL)  # lim is upper xlim

# Plot estimated distribution of
# customers' propensities to drop out
pnbd.PlotDropoutRateHeterogeneity(params2)

# set par to normal
par(mfrow = c(1,1))

Page 30: Predicting Customer Behavior with R: Part 1

Plot population estimates

[Figure: two density plots. Left: Heterogeneity in Transaction Rate (x: Transaction Rate, y: Density; Mean: 0.0522, Var: 0.0049). Right: Heterogeneity in Dropout Rate (x: Dropout rate, y: Density; Mean: 0.0512, Var: 0.0042)]
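The means and variances printed on the two plots can be checked directly from the estimated parameters. The sketch below uses the parameter values reported earlier and the standard gamma moments, treating alpha and beta as rate parameters (mean = shape / rate, variance = shape / rate^2). `gamma_mean` and `gamma_var` are helper names introduced here for illustration.

```r
# Parameter estimates reported on the earlier slide: r, alpha, s, beta
params <- c(r = 0.5528797, alpha = 10.5838911, s = 0.6250764, beta = 12.2011828)

gamma_mean <- function(shape, rate) shape / rate
gamma_var  <- function(shape, rate) shape / rate^2

moments <- round(c(
  rate_mean    = gamma_mean(params[["r"]], params[["alpha"]]),
  rate_var     = gamma_var(params[["r"]], params[["alpha"]]),
  dropout_mean = gamma_mean(params[["s"]], params[["beta"]]),
  dropout_var  = gamma_var(params[["s"]], params[["beta"]])
), 4)
moments
```

The rounded values agree with the plot annotations (0.0522 and 0.0049 for the transaction rate, 0.0512 and 0.0042 for the dropout rate), so an average customer makes about 0.05 repeat purchases per week while active.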

Page 31: Predicting Customer Behavior with R: Part 1

Examine individual predictions

# predicted num. transactions a new customer
# will make in 52 weeks
pnbd.Expectation(params2, t = 52)

# expected characteristics for customer 1516,
# conditional on their purchasing during calibration
cal2.cbs["1516",]
x <- cal2.cbs["1516", "x"]          # x is frequency
t.x <- cal2.cbs["1516", "t.x"]      # t.x is time of last buy
T.cal <- cal2.cbs["1516", "T.cal"]  # T.cal is time observed

# estimate their transactions in a T.star duration
pnbd.ConditionalExpectedTransactions(params2,
  T.star = 52,  # weeks
  x, t.x, T.cal)
# [1] 25.24912

Page 32: Predicting Customer Behavior with R: Part 1

Probability a customer is ‘alive’

x            # freq of purchase
t.x          # week of last purchase
T.cal <- 39  # week of end of cal, i.e. present
pnbd.PAlive(params2, x, t.x, T.cal)

# To visualize the distribution of P(Alive)
# across customers:
params3 <- pnbd.EstimateParameters(cal2.cbs)
p.alives <- pnbd.PAlive(params3, cal2.cbs[,"x"],
  cal2.cbs[,"t.x"],
  cal2.cbs[,"T.cal"])

Page 33: Predicting Customer Behavior with R: Part 1

Plot P(Alive)

ggplot(as.data.frame(p.alives),aes(x=p.alives))+
  geom_histogram(colour="grey", fill="orange")+
  ylab("Number of Customers")+
  xlab("Probability Customer is 'Live'")+
  theme_minimal()

[Figure: histogram of P(Alive) across customers (x: Probability Customer is 'Live', 0.0 to 0.9; y: Number of Customers, 0 to 600)]

Page 34: Predicting Customer Behavior with R: Part 1

Plot Observed, Model Transactions

# plot actual & expected customers binned by
# num of repeat transactions
pnbd.PlotFrequencyInCalibration(params2, cal2.cbs,
  censor=10,
  title="Model vs. Reality during Calibration")

[Figure: Model vs. Reality during Calibration. Bar chart of Actual and Model customer counts (y: Customers) by calibration period transactions (x: 0 to 10+)]

Page 35: Predicting Customer Behavior with R: Part 1

Compare calibration to holdout

• Note of caution: potential overfitting
  – Our gamma distributions are based on the specific customers we had during calibration.
  – How would our parameters and predictions change with different customers?
  – We will address this in Part 2

Page 36: Predicting Customer Behavior with R: Part 1

Get holdout results, duration

# get holdout transactions from the list data,
# add in as x.star
x.star <- data[[2]][[2]][,1]
cal2.cbs <- cbind(cal2.cbs, x.star)
str(cal2.cbs)

holdoutdates <- attributes(data[[2]][[1]])[[2]][[2]]
holdoutlength <- round(as.numeric(max(as.Date(holdoutdates)) -
  min(as.Date(holdoutdates)))/7)

Page 37: Predicting Customer Behavior with R: Part 1

Plot frequency comparison

# plot predicted vs seen conditional freqs
T.star <- holdoutlength
censor <- 10  # Bin all order numbers here and above
comp <- pnbd.PlotFreqVsConditionalExpectedFrequency(params2, T.star,
  cal2.cbs, x.star, censor)

[Figure: Conditional Expectation. Actual and Model holdout period transactions (y: 0 to 10) by calibration period transactions (x: 0 to 10+)]

Page 38: Predicting Customer Behavior with R: Part 1

Examine accompanying matrix

• Bin size in that plot can be seen in comp matrix:

rownames(comp) <- c("act", "exp", "bin")
comp

          freq.0      freq.1     freq.2     freq.3    freq.4    freq.5
act    0.2367116   0.6970387   1.392523   1.560000  2.532258  2.947368
exp    0.1367795   0.5921279   1.181825   1.693969  2.372472  2.876888
bin 1411.0000000 439.0000000 214.000000 100.000000 62.000000 38.000000
       freq.6    freq.7   freq.8   freq.9  freq.10+
act  3.862069  4.913043 3.714286 8.400000  7.793103
exp  3.776675  4.167163 5.698026 5.487862  8.369321
bin 29.000000 23.000000 7.000000 5.000000 29.000000

Page 39: Predicting Customer Behavior with R: Part 1

Compare Weekly transactions

# get data without first transaction: removes those who buy 1x
removedFirst.elog <- dc.SplitUpElogForRepeatTrans(elog)$repeat.trans.elog
removedFirst.cbt <- dc.CreateFreqCBT(removedFirst.elog)

# get all data, so we have customers who buy 1x
allCust.cbt <- dc.CreateFreqCBT(elog)

# add 1x customers into matrix
tot.cbt <- dc.MergeCustomers(data.correct=allCust.cbt,
  data.to.correct=removedFirst.cbt)

lengthInDays <- as.numeric(max(as.Date(colnames(tot.cbt))) -
  min(as.Date(colnames(tot.cbt))))
origin <- min(as.Date(colnames(tot.cbt)))

Page 40: Predicting Customer Behavior with R: Part 1

Compare Weekly transactions

tot.cbt.df <- melt(tot.cbt, varnames = c("cust","date"), value.name="Freq")
tot.cbt.df$date <- as.Date(tot.cbt.df$date)
tot.cbt.df$week <- as.numeric(1 + floor((tot.cbt.df$date-origin+1)/7))

transactByDay <- ddply(tot.cbt.df,.(date),summarize,sum(Freq))
transactByWeek <- ddply(tot.cbt.df,.(week),summarize,sum(Freq))
names(transactByWeek) <- c("week","Transactions")
names(transactByDay) <- c("date","Transactions")

T.cal <- cal2.cbs[,"T.cal"]
T.tot <- 78  # end of holdout
comparisonByWeek <- pnbd.PlotTrackingInc(params2, T.cal, T.tot,
  actual.inc.tracking.data = transactByWeek$Transactions)

Page 41: Predicting Customer Behavior with R: Part 1

Compare Weekly transactions

Page 42: Predicting Customer Behavior with R: Part 1

Formal Measures of Accuracy

# root mean squared error
rmse <- function(est, act) { return(sqrt(mean((est-act)^2))) }

# mean squared logarithmic error
msle <- function(est, act) { return(mean((log1p(est) - log1p(act))^2)) }

# predicted holdout transactions for each person
predicted <- pnbd.ConditionalExpectedTransactions(params2,
  T.star = 38,  # weeks
  x = cal2.cbs[,"x"],
  t.x = cal2.cbs[,"t.x"],
  T.cal = cal2.cbs[,"T.cal"])

cal2.cbs[,"x.star"]  # actual transactions for each person

rmse(act=cal2.cbs[,"x.star"], est=predicted)
msle(act=cal2.cbs[,"x.star"], est=predicted)

Measures not really meaningful without some comparison
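One way to give the error measures a reference point is to score a few naive baselines on the same holdout data. The sketch below does this for the first ten customers only, using the calibration and holdout frequencies visible in the str(data) output. The two baselines ("same as calibration" and "overall mean") are illustrative choices that assume calibration and holdout periods of similar length; they are not part of the BTYD package.

```r
rmse <- function(est, act) sqrt(mean((est - act)^2))
msle <- function(est, act) mean((log1p(est) - log1p(act))^2)

# First ten customers, as shown in the str(data) output
x_cal  <- c(2, 1, 0, 0, 0, 7, 1, 0, 2, 0)  # repeat purchases in calibration
x_hold <- c(1, 0, 0, 0, 0, 8, 0, 2, 2, 0)  # purchases in holdout

# Naive predictions to compare any model against
baselines <- list(
  same_as_calibration = x_cal,
  overall_mean        = rep(mean(x_cal), length(x_cal))
)

scores <- data.frame(
  rmse = sapply(baselines, rmse, act = x_hold),
  msle = sapply(baselines, msle, act = x_hold)
)
scores
```

A model's RMSE and MSLE on the full holdout set can be placed in the same table; if it does not beat baselines like these, the extra complexity is hard to justify.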

Page 43: Predicting Customer Behavior with R: Part 1

Next Week:

• Compare results to a simple model

• Estimate of expenditure / customer value

• Use info about clumpiness of purchase patterns (as in Platzer 2008)

• Use info about seasonality of purchasing, with forecast package

• Improve model predictions with machine learning techniques:
  – Cross-validation to avoid over-fitting
  – Combining model predictions

Page 44: Predicting Customer Behavior with R: Part 1

References

• Griffin and Lowenstein (2001), Customer Winback: How to Recapture Lost Customers—And Keep Them Loyal. San Francisco: Jossey-Bass.

• Platzer (2008). “Stochastic models of noncontractual consumer relationships.” Master of Science in Business Administration thesis, Vienna University of Economics and Business Administration, Austria.

• Schmittlein, Morrison, and Colombo (1987). Counting Your Customers: Who Are They and What Will They Do Next? Management Science, 33, 1–24.

• Wang, Gao, and Li (2010). Empirical analysis of customer behaviors in Chinese e-commerce. Journal of networks 5.10: 1177-1184.

• Wübben & Wangenheim (2008). Instant customer base analysis: Managerial heuristics often “get it right”. Journal of Marketing, 72(3), 82-93.

• Zhang, Y., Bradlow, E. T., & Small, D. S. (2012). New Measures of Clumpiness for Incidence Data.

Page 45: Predicting Customer Behavior with R: Part 1

Purchase rate often depends on type of purchase

1.1 million purchases on 360buy.com from Wang, Gao, & Li 2010